EDUCATIONAL PSYCHOLOGY
BY V. A. C. HENMON

University of Wisconsin

The accompanying bibliography from April, 1929, to April, 1930,
includes references of interest to the educational psychologist,
exclusive of intelligence tests, educational tests, statistical
methods, and tests of personality or character.


1. General Treatises. 


Texts for use in college or normal school classes in educational
psychology covering in the main the usual topics in such courses are
provided by Pintner (131), by Collings and Wilson (33), and by
Gast and Skinner (60). For use in classes in child study are the
revised edition of Kirkpatrick’s well known text (95), the new texts
by Strang (159), and by Swift (160). Here also would probably
belong the study of the child’s conception of the world by
Piaget (130) and the treatment of the process of human behavior
by Sherman and Sherman (148). The psychology of adolescence is
comprehensively treated by Brooks (23), and certain aspects of
adolescence in the more restricted presentations by Schwab and
Veeder (145) and by Wheeler (179). The twelfth edition of
Spranger’s well known book (152) should be mentioned.

Books treating more specifically of the application of psycho- 
logical principles to methods of teaching include Burton (25) on 
the nature and direction of learning, Laton (100) on applications to 
health education, Mossman (118) on principles of teaching and learn- 
ing in the elementary school, Palmer (127) on directed learning, and 
Thorndike and Gates (165) on elementary principles of education. 
Two additions to the books on the psychology of elementary school 
subjects are by Garrison and Garrison (58) and by Schmidt (143). 
The practical applications of the principles of individual
psychology in school procedure are set forth by its chief exponent,
Adler (2), and for the applications of psychoanalysis in education 
there is the third edition of Pfister (129). On psychological theory 
in relation to education, attention should be called to the analysis of 
conflicting theories of learning by Bode (19). Laycock (101) 
accepts the basic laws of Spearman on the nature of ability and sug- 
gests the rewriting of the psychology of school subjects. Zeman 
(190) discusses the practical significance of eidetic imagery and 
eidetic psychology. 

The educational psychologist will find of special interest and 
value certain chapters in the Foundations of Experimental Psy- 
chology, and they are separately listed in this bibliography. They 
are the sections on nervous mechanisms in learning by Lashley (99), 
on experimental studies of learning by Hunter (89), on general 
ability by Pintner (132), on special abilities and their measurement 
by Freeman (54), and on the individual in infancy by Gesell (65). 
He will also find certain chapters worth consulting in the volume on 
applications of psychology by Moss (117). 

Among foreign systematic presentations of educational psychology 
are the volumes by Doring (46), by Kroh (98), by Schulz (144), 
and by Simoneit (149). Raup (138) gives an interesting character- 
ization of psychology and education in America. 


2. General Psychology of Learning. 


Those interested in the formulation of the laws of learning in 
terms of the conditioned reflex should consult the discussions by 
Hull (88) and Williams (181). Carroll (29) and Stephens (155) 
take up again the perennial problem of the law of effect. A com- 
parison of human adults and animals in maze learning is reported 
by Husband (91) and of blind as compared with sighted children 
by Knotts and Miles (96). Manual skills are studied by 
Gemelli (62) and Pear (128). 

A test by Cuff (39) of Ebbinghaus’ results on the proportional effect
of overlearning gives negative results. The comparison of retention
values of maze habits and nonsense syllables by McGeoch and
Melton (105) shows, surprisingly enough, greater retention of
syllables. Cheng (32) and Harden (78) studied retroactive
effect in relation to degree of similarity. The moot problems
of whole and part learning and the relation of initial performance 
to final performance are attacked by Crafts (36) and by
Hartman (80), respectively.

The problem of motivation comes in for various studies. Craw- 
ford (37) studies the general problem of incentives to study on a 
basis of student opinion. In experimental studies Bills and 
Brown (13) find that the greater the amount of work for which 
the individual is set, the higher the initial level of efficiency, 
Deputy (44) that the effect of frequent knowledge of success 
depends on student attitude and that there is not the significant 
improvement shown in laboratory experiments, Jones (94) that 
words with high emotional values are better remembered, and Jer- 
sild (92) that under certain conditions examination is an aid in 
learning. Maller (108) reports an experimental study of coopera- 
tion and competition. Sanderson (142) reviews the literature on the 
effect of the mind’s set in motor learning and gives additional experi- 
mental evidence. 

Elusive problems are those of attitudes and interests and their
measurement. Thurstone (167) discusses a new method for measuring
attitudes to take the place of correlational methods. Duffy’s
monograph (49) on tensions and emotional factors in reaction, the 
study by Freeman (55) on the influence of attitude on structure in 
learning, the measurement of interest differences between students
of agriculture and engineering by Remmers (139), and the further
studies of children’s interest in collecting by Witty and
Lehmann (184) are contributions in this field.

Factors conditioning mental work are investigated in the studies 
by Pollock (133), by Stainer (154), and by Weber (174). The
effect of the summer vacation on achievement of pupils is shown
by Nelson (122) and by Morgan (116). 


3. Psychology of School Subjects. 


1. Reading. Further eyemovement studies by Tinker and Pater- 
son (169) on the effect of length of lines with size of type and con- 
stant show that an 80 mm. line gives the fastest reading. Miles and 
Segel (111) report a clinical method of studying eyemovements in 
the rating of reading ability not involving photographic registration, 
and Miles and Bell (110) study eyemovements in relation to study 
habits. Hovde (87) finds, in a study of effects of size of type,
leading, and context, that reading rate is determined almost entirely
by context.


Analyses of processes involved in reading are given systematically 
by Book (20, 21) and experimentally by Pyle (135) and Dawson (42).
German investigations of reading of note are by Thorner (166) on
the reading of nonsense material and by Heimann and Thorner (82)
on the reading of meaningful material.

Reading interests and habits of adults are surveyed in the volume 
by Gray and Munroe (73) and of various adult groups in the study 
by Grace (72). Reading interests of 487 high school girls are 
reported by Elder and Carpenter (50). 

Two interesting studies of non-readers and remedial procedures 
are given by Dearborn (43) and by Stone (158). Pressey and 
Pressey (134) and Carroll and Jacobs (30) show striking gains in 
experiments on training college freshmen to read. Berry and
Stoddard (12) report an experiment with 172 lispers showing marked
improvement after eight months’ training.

2. Handwriting. Conrad and Offerman (35) report a motion 
picture study of how manuscript writing can be acquired, Gates and 
Brown (61) find print script to be learned a little more rapidly than 
cursive writing in the first half year, while cursive writing seems 
clearly a more rapid form for grades four to six, Guiler (75) shows 
methods of securing improvement in handwriting, and
Newland (123) analyzes the illegibilities of Arabic numerals.

3. Spelling. Horn (86) finds no hard spots in words common 
to pupils and infers that the learning of each word is an individual 
problem. Abernethy (1) analyzes the eyemovements of good and 
poor spellers. Book (22) shows how special disability in spelling 
of a twelve-year-old was diagnosed and corrected. Guiler (76) 
compares oral recall, written recall and multiple choice in testing 
spelling. Stone (157) offers an interesting criticism of spelling texts.

4. English. Symonds and Lee (161,162) continue studies in 
learning English expression with investigations of capitalization and 
the growth of vocabulary in written composition. Leonard (102) 
finds large gains in four exercises in proofreading, error correction, 
and dictation on ability in composition with reference to
capitalization and punctuation. Bennett (10) finds brief drill periods better
than discussion periods of reasons for using certain expressions in 
developing language ability. 

5. Arithmetic. As usual the number of experimental studies in 
arithmetic is large. Buswell (26) summarizes recent investigations. 
Brownell (24) reports large gains from six weeks’ individual
instruction after diagnosis, Chase (31) shows diagnosis and treatment of
common difficulties, Guiler (74) gives methods of correcting errors 
in four fundamental operations, Mitchell (112) finds distinct
difference in favor of specific over general problems, Morgan (115)
similarly finds specific diagnostic and remedial exercises better than
general material, Renwick (140) provides special methods for securing
accuracy in concepts in mensuration, Stainer (154) determined the 
time of day when most work in arithmetical computation was done, 
and Wheat (178) found no significant advantage for imaginative 
over conventional types of problems.


4. Controlled Experiments on Methods.


Archer (5) in a study of spelling investigated the transfer effect 
of the study of words upon selected derived forms and on words 
similarly constructed but not derived. Melby and Lien (109) made 
a study of two methods of teaching, employing not one control and 
one experimental group but several. Russell and Long (141) in 
comparing two methods of instruction in mathematics, find individual 
instruction superior. Simpson (150) found distinct gain from train- 
ing in answering questions, evaluating, outlining, and summarizing 
on ability to read and study history. Washburne (173) likewise 
found inclusion of questions resulted in a significant difference in 
recall and understanding of historical material. Willard (180) 
reports the Dalton plan superior to daily recitation procedure in 
learning history. 


5. How to Study.


Muse (120) provides a manual for use of college students on 
study habits. Woodring and Flemming (186) issue a volume on 
study for high school pupils. Althaus and Gilliland (13) in a con- 
trolled experiment find that instruction in how to study does not 
improve school work.


6. Visual Education.

Wood and Freeman (185) give the results of their studies car- 
ried on for the Eastman Company. Knowlton and Tilton (97) set 
forth the results of experiments with the Yale chronicles in history 
teaching. Lewerenz (103) shows the results of a visual education 
lesson taught with the aid of flat pictures taken from the Yale
chronicles.


7. The Pre-School Child.


Interest in the pre-school child grows apace, to judge by the
increasing number of books, monographs, and articles in this field.
The large volume by the National Society for the Study of
Education on pre-school and parental education (121) heads the long list.
Books giving systematic presentations of the psychology and educa- 
tion of the young child are by Bernfeld (11) based on Freudian 
psychology, by Blatz and Bott (15), by Drever and Drummond (41), 
by Faegre and Anderson in their revised volume on child care and 
training (51), by Goodspeed and Johnson (70), by Muchow (119), 
by Pyle (136) and by Tilson (168). In addition is the chapter by 
Gesell (65) referred to above. Monographs include the study of 
the acquisition and interference of motor habits in young children 
by McGinnis (106), experimental studies in pre-school education by 
Pyle and Murphy (137), and the study of the smiling and laughing 
of infants in the first year of life by Washburn (172). Speech 
development in the early years has been studied by Foulke and 
Stinchfield (53) and by Haggerty (77). Articles on various problems 
of early development are by Gesell (64) on maturation, by Gesell and 
Thompson (66) on learning and growth in identical twins and by 
Weiss (176) on the measurement of infant behavior. 

General Books. Only one book dealing exclusively with
intelligence testing has come to the writer’s attention during the past year.
This is a German book by Hylla (68) which gives a description of 
tests suitable for German schools. It includes a translation of the 
Stanford Revision, and gives a brief summary of the results of 
testing at Hamburg, Berlin and Leipzig. It describes the first 
German “ Testheft” constructed by Bobertag and Hylla in 1925 
following the American model of a group test booklet. The author 
claims this test booklet to be as good as the longer and more com- 
plicated separate tests generally used in Germany. The book gives 
a general discussion of intelligence, standardization, the I.Q. and
elementary statistical techniques. It is written for the use of German 
teachers. 

Thorndike (153) has written the article on intelligence tests in 
the new Encyclopedia Britannica, and Pintner (127) contributes a 
chapter on the same subject in the Foundations of Experimental 
Psychology, edited by Murchison. A chapter on the history of 
intelligence testing is included in Murphy’s (110) general history 
of modern psychology.

Several books contain one or more chapters on intelligence test- 
ing. Brooks (17) deals very thoroughly with the problem of the 
growth of intelligence, Levine and Marks (86) interpret intelligence 
in terms of Gestalt and give samples of different intelligence tests, 
Thomas (152) summarizes the psychometric approach to the study 
of the child, and Moss (109) emphasizes individual differences in
intelligence. Very brief and inadequate chapters on intelligence 
testing appear in the books by Garrison and Garrison (49) and by 
Greene and Jorgensen (55). 

Bibliographies. Pintner (126) gives the usual yearly summary 
in this Journal. Louttit’s (95) bibliography on sex differences 
covers intelligence tests. 

The Meaning of Intelligence. Hazlitt (58) devotes the major 
part of her book to a discussion of the meaning of intelligence. 
“Intelligence is the problem-solving organization of the mind,” and
the general factor in intelligence is the measure of the degree to 
which experiences influence one another or are confluent. She con- 
structs one or two new tests arising out of this idea of intelligence 
and gives the results of testing some college students. Spear- 
man (147) contributes a brief statement of his theory to the new 
Britannica, and he (146) also claims that Kelley’s results in the 
“Crossroads” support the Spearman theory. Asher (8) examines 
intelligence tests by means of the Spearman tetrad criterion. 
Stern (148) elaborates on his well-known definition of intelligence 
in a theoretical article on the meaning of general intelligence. 

The Relations of Intelligence. McClatchy (96) discusses the 
concept of social intelligence and finds a correlation of .16 between 
the George Washington Test and rankings of social adaptability, and 
a correlation of .53 with the Colgate Introversion-Extraversion Test. 
Hovey (65) finds very low correlations between Army Alpha scores 
and three measures of extraversion-introversion tendencies. 
Oates (115) finds no correlation between intelligence and ratings for 
temperamental qualities but a correlation of +.56 between school 
marks and temperament (with intelligence constant). He concludes 
that persistence is the most important factor in determining school 
success. Hartshorne and May (57) in their extensive study on 
service and self-control find a correlation of +.16 with intelligence.
Intelligence and persistence correlate +.15; the correlation between
intelligence and self-control is zero. Schneider (140) gives correlations
between intelligence tests and the Rorschach Test of interpreting standard ink
blots. He believes this test to be better than the usual intelligence 
tests in measuring the abilities of repressed children, and supports 
his contention by case studies. Mackaye (100) maintains that intel- 
ligence is part of a total organic attitude involving emotional 
conditions, etc. In cases of emotional instability intelligence is 
subordinate to the total organic attitude and this must be taken into 
account in the classification of pupils. Meili (102) discusses the 
factor of nervousness and emotional difficulty during testing and 
expects the examiner to counteract such disturbances. Earle (36) 
gives seven performance tests to 570 children between C.A. 13-9 
and 14-1. Detailed results and sex differences for each test are 
shown. Performance tests do not measure intelligence as well as 
verbal tests. They correlate .79 with Binet and .60 with a group 
test. There is no group factor of non-verbal ability and hence such 
tests should not be combined. Pyle and Snadden (132) compare 
high and low I.Q. cases in high school on thirteen different tests.
They find an overlap on six tests and none on seven (mostly idea- 
tional learning tests), and they conclude that most abilities are very 
specific and that there is danger in pooling the results of several 
tests. Conrad and Jones (22) suggest that an intelligence test might 
be constructed by means of questions based on motion pictures, 
which would be very serviceable for the testing of adults. They 
find correlations of .68 and .71 between such tests and Army Alpha 
scores based on over 100 cases, ages 10 to 54. The age curves for 
the Alpha and picture test scores are very similar. The reliability 
of the picture tests is .90 to .93. Bobertag (11) gives three intelli- 
gence tests each four months apart to the same pupils and corre- 
lates these scores with the teacher’s ratings, a new rating being 
made for each test. The correlation of test with teacher increases 
from .53 to .66 to .78, while the correlation of test with test does 
not change. This shows the test rating to be more valid than the 
teacher rating. Atkinson (9) correlates intelligence scores with 
scores on learning tests at different stages of the learning. He finds 
that “intelligence seems to be involved more in the preliminary 
adjustment to relatively simple situations, less in subsequent achieve- 
ment and least in the mechanical limits of ability.” He differen- 
tiates between learning and intelligence, and finds that the more 
complex a response, the more novel a situation, the more it requires 
intelligence. In a similar manner Hertzberg (62) finds that motor 
dexterity alone is of no value for predicting M.A. in his study of 
the motor ability of 46 kindergarten children. With reference to 
mechanical ability Kefauver (75) finds little difference between the 
correlations for 101 cases ranked according to shop work and abstract 
tests of intelligence and mechanical tests. The coefficient with shop 
work for the Terman Group Test is .22; for the McQuarrie Test 
.30; for Stenquist Mechanical I and II .45. Art ability as tested
by the Lewerenz (88) Art Test seems to have no correlation with 
the usual intelligence test. Uhl (157) in a study of 822 high school 
students finds negative correlations from .01 to .08 between intelli- 
gence and time spent on studies; mostly positive correlations with 
time spent on educative activities out of school; and all positive cor- 
relations with time spent on extra-curricular activities. He con- 
cludes that “superior students are not taxing their capacities as 
heavily as slower students.” 

Growth and Constancy of Growth. Thurstone and
Ackerson (155) from a study of 4,208 Binet tests with children from
C.A. 3 to 17 arrive at a growth curve positively accelerated in the
younger years with an inflection point between ages 9 and 12 and 
negatively accelerated thereafter. This extraordinary curve runs 
counter to all previously suggested growth curves. Odom (118) 
studies the results of group intelligence tests and arrives at the usual 
curve showing negative acceleration, but with no sign of an upper 
limit at C.A. 13 or 14. The shape of the curve of intelligence after 
childhood is studied by Mursell (111) who takes the results of a
great many intelligence tests given to reformatory inmates and 
classifies them into age groups of 5 years from 10 to 74. He finds 
a steady decrease in I.Q. from the twenties onward, a decrease from 
I.Q. 89 or 90 to about 67, the biggest drop being from age group 
45-49 to 50-54. With idiots, Moore (105) believes that mental 
growth ceases about C.A. 10. This is based on re-tests of 51 idiots. 
The median change in I.Q. per month is about —.063. Adams (2) 
gives a description of a case tested 21 times from 1911 to 1926, 
showing a gradual decrease in I.Q. from 86 to 54. Woyczyn- 
ska (178) reports the results for 942 tests of 446 children with the 
Bobertag revision of the Binet. The interval between tests varied 
from 6 months to 6 years. The I.Q.’s were mostly below 100. The 
correlation for these 942 tests is .87; the median difference in I.Q. 
is .02, and the interquartile range from +5.8 to —3.6. These 
results are very similar to those found in this country. 

Influences upon Intelligence Ratings. Working in the presence 
of others or in groups has no influence upon intelligence test ratings 
of university students according to Farnsworth (41). Marine (103) 
does not find that being familiar with the examiner influences the 
Stanford I.Q. of young children. To establish familiarity she
devoted a certain amount of time to playing with the children; the
unfamiliar children she had never seen before the test period. She
found no reliable difference in I.Q. between the two groups
previously equated for intelligence. Van Alstyne (159) worked with
75 three-year-old children and measured many environmental factors 
in a fairly objective manner. She found all degrees of correlation 
between these factors and the intelligence of the children, some being 
highly correlated therewith. Chauncey (19) gives correlations 
between Sims’ Score of home environment with Stanford Achieve- 
ment (.30 and .35) and also with Multi-mental (.21 and .19) for 
113 children in grades VIII and IX. There is no correlation accord- 
ing to Viteles (162) between score on the Brown University Psy- 
chological Test and age of pubescence (first menstruation) in a 
report on 236 cases. Westenberger (170) gives intelligence and
achievement tests at the end of the first semester and at the end 
of the school year to 404 children in grades II to VIII. After a 
thorough medical examination forty per cent of the poor group 
received medical attention, but no difference in intelligence or 
achievement is found between the treated and untreated groups. 
Common minor defects seem to have no influence on intelligence 
or achievement. Kempf and Collins (77) find that the average 
number of physical defects decreases with increase in I.Q., and ear
defects show much the greatest increase with decrease in I.Q. “The
relationship between I.Q. and physical defects appears to be of a 
general rather than a specific nature.” 

Scales and Individual Tests. No new intelligence scales for 
individual testing have come to the reviewer’s attention. Kent and 
Shakow (79) describe a revision of their previously published 
Worcester series of form-boards. Kovarsky (82) presents a revi- 
sion of the Rossolimo tests. She reports that Rossolimo first 
described his tests in 1909. His method is an analytic method as 
opposed to the synthetic method of Binet. In the Kovarsky revision 
there are 28 series of tests divided into tests of attention, will, 
memory, judgment and so forth. The testing of a case takes from 
2½ to 5 hours. Each series is scored on a ten-point scale and drawn
on a profile chart. The equation for the profile is P = t (attention)
+ r (memory) + s (higher capacities). The profile for imbeciles
lies between P = 2 and P = 4; that for morons lies between
P = 5 and P = 8. Types of children are distinguished according
to the relationship between the levels of the three different terms
of the equation. An analysis of the responses to the Healy
Pictorial Completion Test II is given by Dorcus (33), and
Remmers (136) shows how the mean M.A. for the different parts of the
Herring Revision fluctuates from part to part in a study of 101 
freshmen tested on this test. 

Group Tests. Kent and Shakow (78) present a standardization 
of several tests which can be combined by the median M.A. method 
and are specifically designed for clinical use. Schultz (141) gives 
the Viteles T 100 test to 392 cases in grades VI to VIII and finds 
a reliability of .91 and a validity between .80 and .91. He finds the
test a useful one for a vocational bureau. Vance (160) describes 
a reasoning sub-test of the Iowa modified Alpha for college students. 
Doll (32) describes three performance tests used for group testing 
of illiterates. Walters and Thomas (166) present a standardization 
of Spearman’s measure of intelligence for schools, and Piéron (123)
takes a French translation of Mira’s Barcelona Test, gives it to 3,017 
subjects, ages 13 to 20, and presents decile norms for each age and 
sex. She finds the normal school students slightly superior to the 
lycée students. The scores increase from age 13 to 20 and she
concludes that intelligence increases with age among adolescents who 
continue in school. Two studies compare primary intelligence tests. 
Sangren (139) studies seven such tests with 100 Grade I children. 
The tests are ranked as to nineteen criteria and the final ranking 
is Haggerty Delta 1, Pressey, Pintner-Cunningham, Detroit, and so
forth. McGraw and Mangold (99) use ten first grade tests given
in 1923-4 and correlate them with the Otis Group given five years 
later to the same children. Dearborn correlates .60, Pintner-Cun- 
ningham .55, Otis Primary .53, Haggerty .53, and so on down. They 
also present intercorrelations of all the tests and correlations of 
each test with the average of all others. Witty and Taylor (175) 
compare the Multi-mental with the Binet M.A.’s of 522 children in 
Grades IV to VI, and find a higher correlation for the Multi-mental 
with the Stanford Achievement than for the Binet. Oates (116)
gives three English group tests to 270 boys, age 11 to 18, and pre- 
sents intercorrelations, as well as correlations with teachers’ esti- 
mates (from —.32 to +.71, by grades) and with school examinations 
(—.10 to +.62, by grades). An examination of the sub-tests by 
the ratio of the young bright to the old dull leads the author to 
suggest that the best sub-tests for innate intelligence are Number 
Series, Absurdities, Best Reason, Cipher, Analogies and Common 
Sense; whereas the poorest are Mixed Sentences, Orientation, Obey- 
ing Orders and Meanings of Words. Two studies point out that 
the significance of I.Q.’s may vary from test to test according to 
the method of standardization. Kefauver (74), therefore, constructs 
a table of equivalent I.Q.’s for ten group tests, and Cole (20) gives
T scores for three well-known group tests. 

The School Child. Purdom (131) presents a study of homo- 
geneous grouping, and finds that such sections do not gain more 
either on standard tests or on semester grades. Furthermore, no 
degree of intelligence shows an advantage with homogeneous
grouping. And Bonar (12), likewise, in a smaller study finds no
difference in reading tests at the end of one year between a high I.Q.
group (av. I.Q. 106), a low I.Q. group (av. I.Q. 97) and a mixed
I.Q. group (av. I.Q. 98). Lincoln (93) summarizes all the
published results on homogeneous grouping and finds that no effect can
be expected unless the teaching is modified. One group intelligence
test is not sufficient for ability grouping. Kefauver (73) discusses 
various ways of sectioning according to ability, and advises a com- 
posite of (1) teachers’ estimates or marks, (2) intelligence tests, 
(3) achievement tests. Abelson (1) compares the A.Q.’s of six 
groups in Grade VI under the Dalton Plan, and concludes that this 
plan is probably good for dull pupils. Mort’s (108) treatment of 
pupil management in school includes a discussion of the use of 
intelligence and achievement tests for classification, guidance, pro- 
motion and the like. 

Lincoln (91) compares the standing of nine schools on intelli- 
gence and educational tests and finds a high correlation between 
them. Van Wagenen’s (161) extensive survey of six groups of 
schools with both intelligence and achievement tests gives compari- 
sons of each subject with mental age constant. In general he finds 
that the length of the school year has a decided influence on school 
achievement. The nine-months’ school is better than the eight- 
months’ school by about one-third of a year in achievement. Kempf 
and Collins (77) report the I.Q.’s of 5,000 children in two very
different Illinois counties. The correlation between I.Q. and school 
marks for 3,747 cases is +.39. The median I.Q. for disciplinary 
cases is 95 (n = 400), for non-disciplinary cases 101 (n = 4,288).
The median I.Q. for the northern county is 102, for the southern 
88. In the northern county the urban I.Q. is 103.5, the rural I.Q.
95; in the southern county the urban I.Q. is 91, the rural I.Q. 84.
Urban native whites in one county obtain a median I.Q. of 108; 
rural native whites 99. Wood's (176) report of testing in private 
schools in the East agrees with several previous studies, showing 
a median I.Q. of 118 for 1,157 children and this is contrasted with
a median I.Q. of 98 for 1,057 city public school children. Gerberich
and Stoddard (51) give some results of a survey of Iowa High 
School Seniors. They show that the upper decile in intelligence is 
made up of 887 cases, of which 59 per cent are male and 41 per cent 
female, and, furthermore, the age distribution of this group shows 
a much younger C.A. than that of the total group. Their test battery 
and first semester grades correlate from .50 to .54. In England 
Marsden (101) gives the results of Binet and group tests in remote 
country schools in Yorkshire and the South-West of England. He 
gives a distribution of I.Q.’s for the 197 cases tested. He finds
that schools differ very much in reference to the total number of 
superior children. Comparing his results with the Northumberland 
survey, he does not find as many superior I.Q.’s. In Germany,
Giese (52) gives the results of 8,000 children tested by means of
group intelligence tests and compares different types of schools. He
finds that the Volksschule has the lowest mean rating, that score
generally increases with age from 14 to 17, and that there are no sex
differences. He does find differences between towns and between
different courses of study leading to different occupations.

The College Student. Edgerton and Toops (37) make an impor- 
tant contribution to the use of intelligence tests in college adminis- 
tration. Of 1,958 freshmen entering Ohio State University in 1923 
only 18 per cent graduated within four years and probably only #4 
per cent will ultimately graduate. A very large proportion of those 
with poor intelligence ratings did not graduate in four years. The 
intelligence test correlates .45 with scholarship; intelligence plus high
school record correlates .56. The best ten per cent in intelligence 
earned 1.78 times as many hours of credit and 2.46 times as many honor points as
did the poorest ten per cent. The authors believe that the university 
is not securing the attainment from the superior student that is 
clearly possible, and this is particularly true of women students. 
Hammond and Stoddard (55) give detailed results of the Iowa
Training and Aptitude Tests given at the University of Iowa. Crawford
(24) finds very little difference between the mean score or
S.D. of those who were candidates and those who were admitted to 
Yale on the scholastic aptitude test. The correlation between intelligence
and time spent in study is —.20 (n = 1,166). He presents
the test ratings according to occupation of parent, showing profes- 
sions highest and business lowest. Brigham (15) writes the fourth 
annual report on the scholastic aptitude tests, discusses item analysis 
and the validity of the test. The correlation with marks is .51 to .55 
(n = about 300). Kellogg (76) contrasts intelligence tests and
matriculation examinations with reference to their correlations with 
students’ marks. Eells (38) finds that students entering Stanford 
from junior colleges score higher on the Thorndike Test than the 
regular Stanford student. Dexter (31) finds that students majoring 
in English have a higher average intelligence score than those 
majoring in other subjects, and West (169) gives the results of 
testing the students at Battle Creek College. Freeman (45) interviews
68 sophomores who rate high on tests and low on grades or
vice versa, in order to find the factors that reduce correlation, and 
Jones (69) studies 40 students scoring low on Army Alpha, 82 per 
cent of whom failed to maintain an average mark of C. He finds 
such students poor college risks. Constance (23) compares fraternity
and non-fraternity freshmen, and finds no difference on the
intelligence tests, but finds that the former secure slightly better 
college marks. 

There are several reports of intelligence testing in normal 
schools and teachers’ colleges. Frasier and Heilman (44) discuss 
the general use of the intelligence test and present some results. 
They say that the intelligence score is the most valuable single item 
in college administration. Four studies deal with the correlation 
between intelligence and teaching success. Pyle (133) finds that 
intelligence and college marks correlate .45, but intelligence and prac- 
tice teaching only .15; and, furthermore, when teaching success is 
based on the judgments of principals in actual school work this cor- 
relation drops to .03 for the first year, and .02 for the second year 
of actual teaching. Sorenson (145) discusses Pyle’s findings and 
calls attention to the homogeneity of his groups and the unreliability 
of the principals’ ratings. He presents some data of his own, where 
the Otis I.Q.’s range from 86 to 123 with a mean I.Q. of 107.5. The
correlation with age is —.62 (ages 16 to 55). But Frasier (43) 
analyzing the results of hundreds of normal school students finds 
no relationship between teaching grades and intelligence. There is 
no difference in the teaching grades of the top five and the bottom
five per cent in intelligence. For 70 cases, Alpha and teaching grade
correlate —.03. On the other hand, Broom (18) with 148 cases finds 
a correlation of +.30 between grades in practice teaching and the 
Thorndike Intelligence Examination, and this is higher than the 
correlation between teaching grades and grades in theory courses
(r = .21), and he believes in the intelligence test for the determination
of a critical score. Condit (21) reports the results of 559 freshmen
and finds a correlation of .45 between academic grades and
intelligence scores, and .50 between academic grades and achieve- 
ment test scores. Cuff (26) finds that intelligence increases from 
one college year to the next because of elimination of the less intelli- 
gent; 22 per cent of those eliminated fall below the 25th percentile 
in intelligence, as compared with 17 per cent of those retained. 
Lauer and Evans (83) report correlations of an intelligence test with 
high school subjects, and Sims (143) contrasts normal school fresh- 
men with college freshmen on various tests and deplores the poor 
intelligence, achievement and socio-economic status of the former. 
Kesselring (80) in Germany discusses entrance examinations for 
teacher training schools. He presents results for his own examination
and for the Bobertag-Hylla adaptation of the Otis S.A., and
criticizes the latter. It correlates lower with educational tests and
teacher ratings, but it takes only 45 minutes, whereas his examination
takes 1½ hours.
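
Sorenson's remark about the homogeneity of Pyle's groups can be illustrated numerically. The sketch below applies the standard correction for direct restriction of range to a correlation such as the .15 reported for practice teaching. The ratios of standard deviations tried here are assumed values chosen only for illustration; the review reports none, and the function name is hypothetical.

```python
import math


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Estimate the correlation that would be seen in an unrestricted group.

    r_restricted: correlation observed in the selected (homogeneous) group.
    sd_ratio: unrestricted SD divided by restricted SD of the selection
              variable (an assumed value; not given in the review).
    """
    if not -1.0 <= r_restricted <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    numerator = r_restricted * sd_ratio
    denominator = math.sqrt(1.0 - r_restricted ** 2 + (r_restricted * sd_ratio) ** 2)
    return numerator / denominator


observed_r = 0.15  # intelligence vs. practice teaching, as reported for Pyle (133)
for ratio in (1.0, 1.5, 2.0):  # hypothetical degrees of homogeneity
    corrected = correct_for_range_restriction(observed_r, ratio)
    print(f"SD ratio {ratio:.1f}: corrected r = {corrected:.3f}")
```

Under these assumed ratios the corrected value rises only modestly, so homogeneity alone would not turn a correlation of .15 into a large one; the unreliability of the principals' ratings, which this sketch does not model, is a separate matter.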

The Superior. Danielson (28) gives a distribution of the I.Q.’s
of all children in special classes for the superior in Los Angeles.
The average I.Q. is 134, and 23 to 27 per cent are above I.Q. 140.
Lincoln (92) studies 54 under-age children admitted to school on 
the basis of intelligence tests now that they have reached grades IV 
to VII. On standard educational tests they are +.33Q above the 
median. The bright are especially good on reading and poorest on 
arithmetic. Similarly, Washburne and Raths (167) find that bright 
young children admitted to school early have kept up with others in 
all studies. During six years they have shown less retardation and 
hence they become increasingly under-age. The writers find the 
policy of admitting under-age bright pupils to be fully justified. 
Working with children of I.Q. 150 and above in special classes 
Danielson (27) made a definite attempt to enrich their reading and 
found that their A.Q.’s rise over a two-year period. They come up 
to their intellectual expectation when given the chance. Kiefer (81) 
compares high and normal I.Q.’s on five motor tests and finds no 
difference between the means for ages 9, 10, and 11, and low cor- 
relations between these tests and intelligence. 

The Feebleminded and Problem Cases. Lewis (89) writes Part 
IV of the British Report of the Mental Deficiency Committee deal- 
ing with the incidence of mental deficiency. Group tests for a first 
selection were followed by Binet and educational tests, and age 14 
was used for the calculation of I.Q.’s of those aged 14 and above.
Only those below 60 I.Q. were regarded as feebleminded. The total 
defective children in the six areas studied was 0.85 per cent; urban 
0.67 and rural 1.05. The incidence for idiots was 30 per cent higher 
for boys than for girls. The estimated incidence for the total popu- 
lation of Great Britain is 0.73 per cent. This is about twice that 
reported by the 1906 Commission. This is due to a more thorough 
survey and to increased longevity, but also, the author believes, to 
an increase in birth rate of the mentally deficient in rural areas. The 
percentage of idiots has also doubled, and idiots would not likely 
have escaped the 1906 survey. The present report undoubtedly 
represents the most thorough study of the incidence of mental 
deficiency over a large area. Popenoe (128) in the United States 
tries to estimate the percentage of the population having an I.Q. 
of 70 or less, basing his findings on various surveys of schools, and
the like. He concludes that 4 per cent fall below I.Q. 70. Contrast
this with the British figure of 0.73 per cent below I.Q. 60, mentioned
above. Popenoe points out that his estimate would mean about five 
million feebleminded in the U. S. A. and only sixty thousand of 
these are at present in institutions. Wallin (165) gives a report of 
all types of special class children in the schools of Baltimore. The 
Binet results for 201 white and 79 colored show 24.3 per cent of the 
former and 73.4 per cent of the latter have I.Q.’s below 68. 
Town (156) summarizes the results of five years’ work in a psycho- 
logical clinic. Of 2,536 children tested, 11.9 per cent were feeble- 
minded. Of 695 behavior problem cases, 24 per cent were 
feebleminded; of 75 unmarried mothers, 29 per cent were feeble- 
minded. Ellis (39) gives the results of intelligence tests at a 
feebleminded colony, and finds a range of M.A.’s from 2 to 12, with 
the median at 7. 
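
The arithmetic behind Popenoe's estimate can be checked with a short sketch. The population figure is an assumption (roughly the 1930 census count) and does not come from the review. The British rate is applied to the same population only to show the scale of the contrast, since the two surveys use different cut-off points (I.Q. 70 and I.Q. 60).

```python
# Assumption: approximate 1930 U.S. census population; the review gives no figure.
US_POPULATION_1930 = 122_800_000


def cases_at_rate(population: int, rate_percent: float) -> float:
    """Number of cases implied by a percentage incidence."""
    return population * rate_percent / 100.0


popenoe_cases = cases_at_rate(US_POPULATION_1930, 4.0)        # 4 per cent below I.Q. 70
british_rate_cases = cases_at_rate(US_POPULATION_1930, 0.73)  # 0.73 per cent below I.Q. 60
in_institutions = 60_000                                      # figure quoted from Popenoe
share_in_institutions = in_institutions / popenoe_cases

print(f"Cases at 4 per cent:      {popenoe_cases:,.0f}")
print(f"Cases at 0.73 per cent:   {british_rate_cases:,.0f}")
print(f"Share in institutions:    {share_in_institutions:.1%}")
```

With the assumed population, 4 per cent comes to a little under five million, which agrees with the figure Popenoe gives, and sixty thousand is a little over one per cent of that total.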

Compiling the results of all cases brought to a clinic, Bridg- 
man (13) finds that out of 3,675 cases, 1,940 were males and 1,735 
females. The distribution according to degree showed slightly more 
male idiots than female (1.5 vs. 1.2 per cent), but more female 
imbeciles than male (6.9 vs. 5.7 per cent), and a much larger per- 
centage of female morons. Morgan’s (106) study of the age of 
menstruation of 138 feebleminded girls shows them to be somewhat 
retarded. Brousseau and Brainerd (16) have written a book on 
Mongolism, including all information about this type of feeble- 
mindedness. They discuss the many hypotheses regarding causative 
factors and reject age of mother and last born in family, and sug- 
gest as the causative factor an obscure disturbance of the ductless 
glands. For 206 mongols the I.Q. mode is about 25. They classify
their cases as 38 per cent idiots, 61 per cent imbeciles and 1 per cent
morons. For 165 cases the average age of death is 14 and the
oldest was 41½ at death. Otis (120) formed a reading circle for
a group of feebleminded girls, ranging in I.Q. from 53 to 85, with
a median at 68. She found that most of them showed improvement,
and this increased their I.Q.’s on a later Binet test, due particularly
to increase in the vocabulary test. Yepsen (179) gave the Goode- 
nough Drawing Test to 37 feebleminded boys three times and found 
re-test coefficients of .89 and .91. The correlation with Binet was
.60. Anderson (4) asked teachers of special classes to classify 125
cases as stable or unstable and found no marked differences in 
intelligence or achievement. Lincoln (94) finds that 502 problem 
children brought to a clinic range in I.Q. from 20 to 110 with a
median at 75. McClure (97) sent a questionnaire to teachers to
list all their problem cases. Out of an enrollment of 26,364 chil- 
dren, 533 or 2.02 per cent were reported. The mean I.Q. was 90.1 
for 499 usable cases, made up of 416 boys and 83 girls. The 
author compares the high and low I.Q.’s as to types of misconduct. 

Rabinovitch and Rossolimo-Savitch (134) give the results of a 
comparison of Binet and Rossolimo tests of 355 abnormal children. 
There were 183 morons, but according to the Binet test 96 tested 
normal while none tested normal according to the Rossolimo test. 
The correlation for these cases between the two tests was .48. Of 
46 imbeciles some tested normal according to the Binet. The authors 
are very critical of the Binet Scale. They regard it as good for 
a first preliminary classification, but not for final diagnosis and of 
no use for showing special deficiencies, as is the case with the 
Rossolimo Test. Geissler (50) reports the use of the Rossolimo 
Test given according to Bartsch’s method to 15 cases in his special 
class for speech defectives. He gives profiles for each child, a class 
profile and a class equation, e.g., P = (5.86 + 7.82 + 7.52) +
24% V. Garfunkel (47) finds a larger per cent (96%) of eidetikers
among special class children than among normal (65%). There is 
no correlation between intensity of eidetic imagery and the intelli- 
gence scores on the Pintner-Cunningham Test. 

The Delinquent. Healy et al. (60) in their recent book find that 
85 per cent of delinquents of normal mentality and personality were 
successful in foster homes, whereas only 40 per cent of delinquents 
of abnormal mentality and personality were successful. Wieg- 
mann (171) in Germany contrasts 227 delinquents with 227 non- 
delinquents of the same age and some equivalence of schooling. He 
gives many intelligence tests and analyses the results of each test 
separately, finding the delinquents in general below the non- 
delinquents. Oseretzky (119) presents the results for 378 juve- 
nile delinquents in a Moscow “workhouse.” They are be- 
low the norm in physical measurements. On Binet Tests 
only 7 per cent test at age, the rest are one or more years 
retarded. Twenty and eight-tenths per cent are feebleminded 
according to the author’s diagnosis. In general 62 per cent are physi- 
cally and 43.1 per cent are mentally backward. Bridgman (14) 
describes four cases of murder brought before the juvenile court 
since 1914. The I.Q.’s are 71, 75, 85, 96. The author gives detailed 
case studies and concludes that two were influenced largely by environmental
factors, and two by native traits. Riddle (137) presents
343 cases of juvenile stealing, and gives the median M.A.’s and I.Q.’s 
for different kinds of theft. Poull and Montgomery (129) believe 
that the Porteus Maze Test shows a difference between socially 
adjusted and maladjusted cases. The latter do poorer on the test. 
They are lacking in planning ability. But strangely enough they 
do not score poorer on Healy Completion II. Mohr and Gundlach
(104) study 55 native white adult prisoners. They are superior
on the Army Alpha to the Illinois draft, and resemble very much 
Murchison’s sampling of adult prisoners. They give the correlations 
between intelligence and various physical measures. Raphael
et al. (135) give a distribution of the I.Q.’s of 100 cases of traffic
offenders. The median is about 80, with a range from 46 to 118. 

The Deaf and Blind. Upshall (158) makes a very careful com- 
parison of deaf pupils in day schools and in institutions by means 
of 311 pairs matched for C.A. and intelligence score. Day schools 
attract the brighter pupils. They are better on educational tests, 
amount of hearing, age of becoming deaf and years in hearing 
school. Yet even when all of these factors are equated the day 
school is still slightly superior to the institution on educational 
achievement and the author is inclined to suggest that the teaching 
may be better. Pintner (124, 125) discusses the relation between
measures of speech, speech-reading, intelligence and achievement with 
deaf pupils. He finds no correlation between the first two measures 
and the Pintner Non-Language Test, but fair correlations between 
them and the educational test. Speech is also positively correlated
with amount of hearing and age of becoming deaf. Drever (35) 
in England reports briefly the results of testing 1,474 deaf children, 
ages 5 to 16, on his performance scale. The age medians for these 
deaf children are actually a little above the hearing norms which 
the author says were based on 200 hearing children “very markedly
above the normal.” He finds the boys decidedly better than the 
girls on his performance tests. A very wide and detailed research 
program for the deaf and hard of hearing has resulted from two 
conferences under the auspices of the National Research Council 
and this has now been printed (Anon. 7).

Only one reference to the blind has come to the writer’s attention.
Hayes (59) describes his new revision of the Binet Scale for the 
Blind. It is standardized on blind children. Their median attainment
is below that of the seeing, in terms of I.Q., probably about
10 points below.

Racial Comparisons. Several studies involving negro and white
comparisons have been published. Peterson and Lanier (122) 
report comparisons in Nashville, Chicago and New York on several 
intelligence tests. There is a marked difference between Nashville 
whites and negroes on three group tests (2.2 σ), but no real difference
on three ingenuity tests (0.8 σ). There is no difference
between the New York whites and negroes but a great difference 
between Nashville negroes and New York negroes. The correla- 
tion between skin color and intelligence is about +.21. The authors 
are critical of our present tests as measures of race differences. 
Young (180) compares 323 whites with 314 negroes in Grades III 
and up, ages 9-10, in Louisiana. The children came from the same 
wards of the city. On the N.I.T. the white score is 70 as compared 
with the negro score of 40. The light negroes score 45 and the 
dark 37. On suggestion tests the negroes are about 1*/; times more 
suggestible than the whites. In Kempf and Collins’ (77) large survey
in Illinois the negro I.Q. is 71 as compared with the white I.Q.
of 88. Price (130) gives results for several different intelligence 
tests in eleven negro colleges. The percentage of negroes exceeding 
the white median varies from 9 to 43 for 7 negro colleges on the 
Otis Test. In terms of I.Q. the white median is 109 and the negro 
98. Davenport and Steggerda (29) write on race crossing in 
Jamaica. They are mainly concerned with anthropological measurements
but they report briefly results on some psychological tests.
The number of cases varies much from test to test and little is said 
about the selection of cases for each test. In general they conclude 
that the negro is superior in sensory equipment, but the white is 
far ahead in planning. Gray and Bingham (53) in a report devoted 
chiefly to musical ability tests, give results for 258 colored and 219 
white on an intelligence test. The average index of brightness of 
the colored is 76 as compared with 108 for the whites. Incidentally 
it may be noted that these authors find the whites superior on 
musical tests, whereas Davenport and Steggerda, mentioned above, 
found the negroes superior. Viteles (163) gives a summary of test 
comparisons between negroes and whites, but presents no new data. 

Hughes (67) gave intelligence and achievement tests in three 
London elementary schools with a mixed Jewish and Gentile popula- 
tion, and found the Jewish children at almost every age from 8 to 
13 superior to the non-Jewish in both intelligence and achievement. 
When classified according to parental occupation the Jewish excelled 
the non-Jewish by 6 to 10 points of I.Q. in most groups. Similarly 
Garrett (48) in this country studying 296 college freshmen found
the Jews superior to the others on the Thorndike Intelligence Test, 
the Moss Social Intelligence Test and in college marks. After the 
Jews, on the Thorndike Intelligence Test, came the German, English, 
Italian and lastly the Irish. The Italians were last on the Moss 
Test and the Irish in college marks. The Thorndike and Moss 
Tests correlate .57, the Thorndike and marks .63. Lima (90) 
reports the results of group tests given in seven towns in Minne- 
sota and compares 10 nationalities, comprising 85 per cent of the 
population studied. The average I.Q. is 107. The English, Danish, 
Scotch, Swedish, Norwegian rank in that order above the average. 
The Dutch are at the average. Below come the Irish, German, 
Finnish and French. No nationality comprising less than two per 
cent of the population is included in this list. The Illinois county 
survey by Kempf and Collins (77) gives the following average 
I.Q.’s for nationality groups: native white 107, mixed 105, British 
105, Scandinavian 104, German 98, Polish 87, Italian 87. Les- 
ter (85) gave six performance tests to 26 foreign children in Grade I 
and found correlations ranging from .28 to .65 with the Binet and 
from —.05 to .54 with teachers’ estimates. Polish children did 
better than American on the Kohs, Knox Cube and Healy A, and 
he advises a combination of the two latter tests with the Binet for
foreign children. Hsiao (66) reviews all the studies of Chinese and
Japanese compared with Americans. He lists the tests on which 
each nationality is superior. Wood (177) gives the results of 120 
women students in Constantinople Women’s College. The mean 
I.Q. on the Otis Higher S.A. is 95.5 as compared with a mean of
118 reported by Otis. The author gives separate means for four 
different nationalities. 


Inheritance. Wingfield (173) gives results for 102 twin pairs
(45 identical and 57 fraternal) and 29 orphans reared together for
25 to 75 per cent of their lives. The correlation for all twin pairs
is .75 (age constant), and the correlations for older and younger 
pairs are .71 and .77 respectively. The fraternal pairs correlate .70 
and the identical .90. For 15 pairs of orphans the correlation is
.13. Twins are as much alike on educational as on intelligence 
tests, hence schooling does not make them more alike. New- 
man (112,113,114) has three articles each dealing with a pair of 
identical twins reared apart from early life. In general the twin 
having had better education and better home environment obtains 
a higher I.Q. to the extent of 12 points. Holzinger (64) gives correlations
for mental and physical traits of 50 identical twin pairs
(average r = .90) and for 50 fraternal like-sex twin pairs (average
r = .60). He finds nature and nurture equally effective in producing
mean twin differences in intelligence. McFadden (98) finds
that the I.Q.’s of siblings on the Terman and Kuhlmann tend to
decrease from the youngest to the oldest child. There is no such 
decrease on the N.I.T., because it is too dependent on schooling. 
Freeman (46) discusses the general nature-nurture problem and 
finds it impossible to separate nature from nurture. Paterson and 
Williamson (121) take Pearl’s own data and, by using the Barr 
and Taussig scales, show that his so-called average social status is 
in reality much above the average, and, therefore, the data confirm 
the findings of Galton and Terman instead of contradicting them, 
as Pearl had argued. Allen (3) investigates the background of 
49 families containing gifted children (range of I.Q.’s 133-190,
mean I.Q. 157). Seventy per cent of the fathers are in the highest
group of Taussig’s classification. The score on the Whittier Home 
Rating Scale is 23.3 as contrasted with 20.8 for unselected cases. 
Ten cases of insanity were found in the 2,800 forefathers reported, 
or about the same rate as in the population at large. The author 
finds that “these families are not being maintained by reproduc- 
tion.” Willoughby (172), however, finds that college stocks are in 
no danger of elimination, although there is a small negative correlation
(r = —.3) between intelligence ratings and fertility. Sutherland
(150) pushes further the relationship between I.Q. and size
of family by investigating children of fathers all in the same occu- 
pational group, i.e., miners. With 3,096 cases he reports negative 
correlations for different samples, r = —.13; —.14 (girls only);
—.12 (boys only). Hence, when occupation and social status are
about the same, we find the same tendency for large families to 
have lower I.Q.’s, as we find in populations of mixed occupations 
and social status. Thurstone and Jenkins (154) find a correlation 
of —.09 between intelligence and size of family. Their study 
includes 10,000 case records of a clinic, ages 1 to 21. The distri- 
bution of I.Q.’s for birth order shows that the mean intelligence 
rises with order of birth and that this effect seems to be progressive 
as far as the eighth child. But Jones and Hsiao (71) find no dif- 
ference in intelligence due to birth order in a study of 614 pairs 
of siblings by calculating the mean difference in sigma score for 
adjacent pairs and non-adjacent pairs. Stroud (149) reports a cor- 
relation of —.16 for intelligence and size of family in a study 
embracing 1,057 cases, ages 5 to 18, in rural regions in Georgia.
He further finds a correlation of +.26 between intelligence and 
the tax assessment of the home, in spite of the homogeneity of the 
group in both variables. In Russia, Sirkin (144) gave group intelligence
tests with a reliability of .96 to between 600 and 800 children
in each of Grades IV, V, and VI and found a rise of score with 
social level. His correlations for score and level for the three 
grades are .36, .43, .39. A second testing 14 months later gave 
similar results. Witty and Lehman (174) compare the heredity 
background of 50 cases I.Q. 70 and below with 50 cases I.Q. 140
and above. The parents of the low I.Q.’s are 78 per cent American,
mostly laborers with no college education; the parents of the high
I.Q.’s are 98 per cent American, mostly in business and 50 per cent
have college education. The authors conclude that such results may 
be due to either hereditary or environmental influences. Heil- 
man (61) in a study of 828 ten-year-old children finds that 57 per 
cent of the variation in E.A. is due to M.A., 7 per cent to environ- 
ment and the rest not accounted for. The maximum amount of 
variation in E.A. due to school training is 19 per cent, and the 
remaining 81 per cent is perhaps due mainly to heredity. But only 
2 per cent of variation in grade location is due to M.A. and about 
31 per cent to school attendance. 
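
Percentages of variation such as those Heilman reports can be set beside the correlation coefficients quoted elsewhere in this review. The sketch below assumes that "per cent of variation" may be read as a squared correlation (the coefficient of determination). That reading is an interpretation made here for illustration; the review does not state Heilman's method, and the function names are hypothetical.

```python
import math


def variance_share_from_r(r: float) -> float:
    """Proportion of variance accounted for by a correlation r (r squared)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation must lie between -1 and 1")
    return r * r


def r_from_variance_share(share: float) -> float:
    """Magnitude of the correlation implied by a proportion of variance."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must lie between 0 and 1")
    return math.sqrt(share)


# 57 per cent of the variation in E.A. attributed to M.A. (Heilman, 61)
print(f"r implied by 57 per cent: {r_from_variance_share(0.57):.3f}")

# Correlation of +.39 between I.Q. and school marks (Kempf and Collins, 77)
print(f"Variance share for r = .39: {variance_share_from_r(0.39):.1%}")
```

On this reading, 57 per cent of the variation corresponds to a correlation of about .75, while a correlation of .39 accounts for roughly 15 per cent of the variation.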

Employment and Guidance. An anonymous article (Anon. 6) 
describes the uses of intelligence tests in personnel work and gives 
descriptions and norms of suitable tests. Anderson (5) in his book 
touches on the use of intelligence tests in a large store, finds a slight 
correlation with sales ability, and gives a percentage distribution of 
the I.Q.’s of 500 sales clerks. In the dull group (I.Q. 80 to 90)
we find 34.4 per cent, and 19.6 per cent fall even below this, whereas
only 5.4 per cent rise above I.Q. 110. Scudder (142) gave the
Terman group test to 264 veterans being trained as accountants and 
bookkeepers. The median score for those finishing the course was 
much higher than for those who dropped out. After three to four 
years of employment 103 cases were followed up. Those scoring 
high were doing better than those scoring low, and this was true also
of the average monthly increase in wages. The author suggests 
minimum scores for these two occupations. Walch (164) gives a 
survey of the literature on mental tests used in guidance with a 
bibliography of 74 titles. Roloff (138) in Germany discusses the 
use of intelligence tests in industry and the problems of validation 
of such tests. Hartson (56) gives the intelligence ranking of the 
occupations of 1,135 alumni of Oberlin. These rank from college
teaching and secretarial work (high) through medicine, law, and 
so forth, down to art and physical education (low). 

Miscellaneous. Odell (117) discusses briefly the Achievement 
Quotient and decides it is too unreliable for use. Dearborn and 
Smith (30) re-scored 530 test blanks and found that 73 per cent 
contained errors, 6.4 per cent of which amounted to an error of one- 
half or more years of mental age. They analyze the different types 
of errors. Dougherty (34) presents case studies of 18 children 
tested and treated at an educational clinic. Wells (168) discusses 
testing in relation to many medical problems. Fracker and Howard
(42) present correlations between the six Seashore music tests
and intelligence for college students. Pitch and I.Q. correlate highest
(r = .32). Hindman (63) finds that 155 athletes out of 1,327
freshmen college students fall below the nonathletes by 2.76 per- 
centile points and thus show no difference in intelligence. Lew- 
erenz (87) finds that high I.Q.’s do better than low I.Q.’s in visual 
education lessons, but the low I.Q.’s profit relatively more from such 
instruction. Kaulfers (72) finds a general rise in achievement in 
Spanish with rise in I.Q. The difference between the sexes is 
marked. Boys require an I.Q. about 10 points higher to equal the 
achievement of girls. In comparing 500 pairs of husbands and 
wives, Jones (70) finds a correlation of .25 for physical traits, but 
a correlation of .55 on intelligence tests. He sums up all data so far 
published on this topic. Symonds (151) discusses the choice of 
items on the basis of difficulty. Awaji (10) reports the results of 
testing 6,000 Japanese soldiers, probably a random sampling of 
Japanese males because of compulsory service. He shows the curves 
for each of the nine subtests and almost all approach the normal. 
The order of occupations is the same as found in the American 
army. He gives also distributions according to school training and 
arm of service. Estabrooks (40) takes 444 children of North Euro- 
pean stock and finds no correlation between pigmentation of eye and 
hair and intelligence. Crosland et al. (25) find no relation between
intelligence and amount of error or amount of improvement resulting
from practice on the Müller-Lyer illusion. Moriwaki (107) takes
15 students and finds a correlation of .28 between their intelligence 
scores and the average rating of their intelligence from their photos 
by four judges. The correlation of intelligence score with rating of 
intelligence from a short interview goes up to .56. Lehman and 
Wilkerson (84) analyze the results of a play questionnaire given to 
6,000 cases and find that C.A. is more potent than M.A. in
influencing a boy’s play behavior.

BY VERNON JONES AND MASON CROOK
Clark University 


I. GENERAL 


Since it was felt that most readers will be interested in the devel- 
opments in particular aspects of the measurement work, this review 
has been organized as nearly as possible around the different fields 
of activity. New tests which have appeared are therefore not all 
grouped together, but are discussed in connection with the use to 
which they are to be put. The most noteworthy developments this 
year have been in the refinement of diagnostic tests and in the con- 
struction of unit tests. This conclusion is based on a comparison of 
the number and significance of articles in these fields this year with 
those of previous years. 

Three general textbooks on educational measurements have 
appeared within the past year. Greene and Jorgensen (54) give a 
clear exposition of the various types of tests and the principles gov- 
erning the selection and application of them by the classroom 
teacher. This book is written from the point of view that the most 
important function of testing is the diagnosis of pupil difficulties and 
the improvement of teaching, and much material illustrative of this 
type of work is included. A comprehensive handbook on educa- 
tional test procedures has been prepared by Russell (117). This 
book gives the most complete and detailed account of the method of 
computing various scores in the age, grade, and T-score scales that 
has appeared in one volume. Carroll (24) presents an elementary 
handbook on measurement, which contains one chapter on methods 
of handling certain selected educational tests. Pintner (107), in 
his new textbook on educational psychology, makes an important 
contribution in calling attention, through his organization, to the close 
relation of educational measurement to other work in educational 
psychology. He takes the attitude that some of the most important 
of the direct contributions to this branch of psychology have come 
through the work in measurement. In his text he includes three 
chapters on educational tests as an integral part of his general
organization, rather than as a sort of addendum, as has been quite a
popular practice.

A general but condensed treatment of tests is given by Free- 
man (47) in the “ Foundations of Experimental Psychology.” This 
chapter covers the history, construction, standardization, use, and 
evaluation and interpretation of educational tests considered as 
measures of special abilities. A bulletin by Monroe, Odell, and 
others (91), containing a lengthy review of educational research 
during the years 1918-1927, has appeared. One chapter is devoted 
to research in educational measurements. An extensive bibliography 
constitutes an important part of the work. Jones (76) gives a 
review of the developments in the field of educational tests during 
the year 1928. The bibliography includes 121 titles. Odell (98) 
has made an analysis of twenty recent publications on educational 
measurements and compiled a frequency table of references to out- 
standing contributors in that field. The men referred to with the 
greatest total frequency are Thorndike, Ruch, Terman, Monroe, 
and McCall. 

Sangren (121), in discussing the need for more adequate meas- 
ures in arithmetic, points out the present trend towards more 
analytic and diagnostic types of test, of which the Wisconsin 
Inventory and the Compass Diagnostic Tests are examples. 


II. DEVELOPMENT AND USE OF TESTS FOR SURVEY PURPOSES 


(a) New Tests. The most important battery of tests for the 
elementary school level to appear during the year is the new Stanford 
Achievement Test, which was devised by Kelley, Ruch, and Ter- 
man (78). It is a complete revision of the earlier battery. Ten 
tests covering the major school subjects are included in the booklet 
for grades 4 to 9. A separate booklet containing five tests has been 
made for grades 2 and 3. Three equivalent forms of these batteries 
are now available and two more are promised. Age and grade 
norms are provided. The standardization of this series has been 
done very carefully and thoroughly. Another significant battery to 
which we wish to call attention is one prepared by Sones and 
Harry (127) for use in high schools. It provides a measure of 
general achievement in four fundamental fields—language and 
literature, mathematics, natural science, and social studies. Two 
forms are available, and they correlate with each other up to the 
extent of about .90. Norms for each semester are given in percentile 
units. 
As for new instruments in special fields, we may mention first 
a number of tests which have been designed for measurement of 
reading. A first grade reading scale has been prepared by Cutright, 
Van Wagenen, and others (32). Only tentative norms are available 
at present. A first grade word reading test has been constructed by 
Pressey (111) in forms A and B. It consists of words chosen from 
the Gates Primary Word List. The two forms correlate to the 
extent of .93. Norms are being determined. Williams (151) has 
devised a test to measure comprehension in reading for grades 3 to 
9. Age and grade norms are given, but so far no facts have been 
presented on reliability and validity. Payne (104) reports norms 
for reading of passages from Gray’s Oral Reading Paragraphs pre- 
sented tachistoscopically for .1 second. The norms are based on 
400 pupils in grades 2, 3, 4, and 5. Weeks (148) describes a vocab- 
ulary test which has been under construction for a number of years. 
Reliability coefficients are in the neighborhood of .90. Distributions 
of scores by ages and grades are given from which norms can be 
determined. A vocabulary test, called the Centennial Test of Word 
Meanings (66), has been worked out by Hill in connection with 
Italian-American children. It contains words from Gates’s lists. 

Several new tests and scales have been constructed for use in 
measuring language abilities. The Wilson Language Error 
Test (153), designed to aid the pupil in recognizing and avoiding 
the common language errors, has been revised. Stoddard (134) 
describes a new test of English grammar used in a survey of forty 
high schools in Iowa. The items were chosen from other tests, 
agreed upon by judges, and given a preliminary tryout on 200 
students. Reliabilities by grades range from .87 to .93. Complete 
norms are given. Results show large variability among schools. 
Davis (35) has devised a test of English fundamentals. Burch (20) 
has constructed a test which purports to measure comprehension of 
literature under the following three rubrics: action and event, char- 
acter portrayal and emotional appeal, and intellectual interest. The 
test is suitable for junior and senior high schools. Grade norms are 
available and some facts on reliability are given. Huxtable (70) 
presents some preliminary results on the construction of a scale for 
judging the thought content of written English which is being 
designed for use in high school and college. Daringer (33) describes 
a test of ability to make topical outlines. Norms based upon 400 
subjects are presented. Two forms are available, which correlate 
to the extent of .90 with each other. The Guy Spelling Scales (59) 
designed for use in grades 2 to 9 have appeared. Three forms are 
available and grade norms are provided for each. 

There are many other subjects in each of which only one or two 
tests have appeared. The Columbia Research Bureau Algebra Series 
is now supplemented by the publication of a new section, called 
Test I (159), which is intended for use at the end of the first 
semester of that subject. The average self-correlation of this test is 
.92, and its average correlation with teachers’ marks is .70.
Percentile norms based upon 598 students are given. A chemistry test 
by Wood and others (158) appears in the Columbia series. It covers 
elementary work in high school and college. Percentile norms based 
upon 8,000 cases are given. Facts on reliability and validity are also 
presented. Two tests in plane geometry have been constructed by 
Orleans (102), one for the middle and the other for the end of a 
year’s course. Percentile norms are given. Scores on each test cor- 
relate about .80 with teachers’ marks. The median of the self- 
correlations obtained on Test I is .85, on Test II, .71. The Junior 
American History Test by Carman, Barrows, and Wood (157) is a 
comprehensive achievement test in American history designed pri- 
marily for junior high schools. Percentile ranks and reliability 
figures based upon seventh grade pupils are given. Powers and 
Oakes (95) have constructed an important test in general biology 
for use in secondary schools. Coefficients of reliability average over 
.90. Norms are available. Elwell and Fowlkes (43) have devised a 
bookkeeping test in two parts, one for each semester. Norms are 
based upon more than 200 high school students. The reliability 
coefficients for the first and second parts are .82 and .87, respectively. 
Conard (30) describes the construction of a manuscript writing 
scale. It consists of twelve forms for use in judging pencil-written 
specimens in grades 1 to 4 and ten forms for use in judging pen 
writing in grades 3 to 6. Samples were selected by successive 
groups of 40, 26, and 10 judges. West (150) has devised a hand- 
writing scale which is especially applicable to the Palmer system of 
writing. It contains samples scaled in both rate and quality for 
grades 2 to 8. Grade norms are given. Bathurst and Scheide- 
mann (7) describe a test of knowledge of general psychology 
designed for use in college classes. Norms based upon about 400 
students in various colleges are reported. The reliability of the test 
is .80. Odell (99) reports an experiment with a scale designed to 
assist in the evaluation of answers to thought questions in four high 
school subjects. He found that ratings with the scale were only 
slightly more reliable than ratings without it. Van Wagenen (144) 
has published a reading scale in educational psychology, with tenta- 
tive norms derived from a class of college juniors. 

A very significant extension of measurement to a new field is 
represented by the construction of the Health Education Tests 
described by Franzen (45). This represents the first application of 
the more refined methods of test construction to the field of child 
health. Norms are available for grades 5 and 6. Tables for 
obtaining standard deviation indices are given separately for boys 
and girls and for five different intelligence levels for each part of 
the test. The reliability as determined on eleven-year-old children 
is about .86. 

An art judgment test for use in junior and senior high schools 
has been prepared by Meier and Seashore (88). The test consists 
in having the subject choose the better in each of 125 pairs of pic- 
tures. Tentative norms are provided. 

Additional tests which have come to our attention are as follows: 
a physics test by Miller (90); a test of knowledge of laboratory 
technique in chemistry by Persing (106); a series of tests for 
mechanical drawing by Badger (5); a Latin grammar test by 
Hutchinson (69); a French test by Sammartino and Krause (120); 
an achievement scale in household science by Davis (34); a clothing 
test by Frear and Coxe (46); a general business training test by 
Smith (126). 

(b) Use of Tests in Studies of a Survey Nature. Eels (42) 
reports the frequency with which different school subjects were 
tested and the frequency with which different tests were used in 72 
surveys which were made between 1914 and 1928. Arithmetic and 
reading were tested most often. 

In 1928 about 5,000 college seniors, as well as many thousand 
seventh grade pupils and high school seniors, were given
comprehensive batteries of achievement tests in connection with an
extensive project made possible through a cooperative plan worked out 
by the Carnegie Foundation for the Advancement of Teaching (23), 
the State Department of Education, and various schools of higher 
learning in the state of Pennsylvania. Full analysis of the results 
from the college group has not been completed. A carefully devised 
plan for following up the younger groups has been put into opera- 
tion. This college achievement test, which is by far the most exten- 
sive test of its kind which has been devised up to date, is not yet 
available for general distribution. The fifth and sixth annual reports 
of the testing programs put on by the Public School Publishing
Company have appeared (103, 67). The reports give median scores 
obtained by grades and by subjects. These programs have been 
named Annual Nation-Wide Testing Surveys by the publishing 
company. The latter survey was based upon the following odd 
combination of tests: The Pressey Diagnostic Reading, the Williams 
Reading, and the Detroit Mechanical Aptitude Tests. Gerberich and 
Stoddard (48) report the results of the annual Iowa High School 
Survey, in which the test batteries making up the University of Iowa 
entrance examinations are used. Comparisons between average 
scores for all high school seniors and university freshmen show only 
slight superiority for university freshmen as a whole, but there is 
some evidence that the use of the survey results is becoming a factor 
in encouraging superior students to go to college. Abelson (1) 
reports the results of the application of the Stanford Achievement 
Test to 169 sixth grade pupils under the Dalton plan. In terms of 
accomplishment quotients the bright pupils were 2.3 points above 
the norm, the average pupils 10.5 above, and the dull pupils 9.3 above. 

Henmon (63) reports some results from the Modern Foreign 
Language Study. Norms for achievement in French are presented 
graphically for England, Canada, and the United States, the United 
States excelling in vocabulary, grammar, and silent reading, England 
excelling in composition. The differences are due in part at least 
to different arrangements of courses. Much overlapping and large 
variability were found. Shimberg (124) reports a comparison of 
urban and rural children of grades 4 to 8 on two different tests of 
information, one adapted to and standardized on 6,477 city children 
and the other based on 4,875 country children. On the test stand- 
ardized on city children the country children fell about a year 
behind; on the test standardized on country children the city chil- 
dren fell about a year behind. Van Wagenen (143) reports a state- 
wide survey of Minnesota schools which was designed to afford a 
relative evaluation of eight-month versus nine-month schools, and of 
small rural versus consolidated schools. An effort was made to test 
every measurable phase of school achievement. Appreciable superi- 
ority was found for the nine-month schools over the eight, and for 
the consolidated over the small rural schools. Wilson and Ash- 
baugh (156) conducted a survey to determine whether pupils in con- 
solidated schools showed better achievement than pupils with the 
same intelligence in one-room schools. In the study four consoli- 
dated and fifty one-room rural schools in Ohio were tested on the 
Illinois examination. Thirty-three out of thirty-six differences in 
achievement scores favor the consolidated schools. Pratt (110) 
compared the standing of the Japanese, Chinese, Hawaiian, and part- 
Hawaiian children in Hawaii on the Stanford Achievement Test and 
found, on the basis of averages, that the Chinese made the highest 
scores and the Hawaiian the lowest, the difference between these two 
groups being almost a full grade. 

Paynter and Blanchard (105) report a study of the educational 
achievement of 330 problem children in two large cities who were 
referred to child guidance clinics for reasons other than retardation 
in school. These experimenters did not find the maladjusted chil- 
dren in the two samplings studied to be distinctly below normal 
children in educational achievement, but doubtless the factor of 
selection is a very important one. Different clinics often specialize 
in different types of cases, and it would seem, therefore, that these 
findings could be generalized to cover only those samplings where the 
type of defect studied was the same as in these two clinics. 


III. INTENSIVE STUDY OF CURRENT INSTRUMENTS AND TECHNIQUES 


In describing the construction of educational tests, Bucking- 
ham (18) strongly emphasizes the necessity for including a large 
number of standardized test items at each level of difficulty. Appre- 
ciating the excessive labor entailed by this recommendation, he sug- 
gests a short method of standardizing new material by calibrating it 
against material which has already been standardized. Douglass and 
Huffaker (38) make an analysis of the negative correlation usually 
found between I.Q. and A.Q. They conclude that the negative 
correlation is spurious, due to the fact that the ratio, I.Q., is being 
correlated against a ratio of which it forms a part, as can be seen 
from the formula A.Q. equals E.Q. divided by I.Q. Odell (97) 
writes a brief elementary account of various measures which have 
been used to express the relation between educational achievement 
and native capacity. A large part of the discussion is on the A.Q. 
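
The spurious-correlation argument of Douglass and Huffaker can be illustrated with a small simulation. The sketch below is not their procedure. It assumes hypothetical, normally distributed I.Q. and E.Q. scores (mean 100, S.D. 15) that correlate positively (the value .6 is an assumption chosen for illustration), forms A.Q. as 100 times E.Q. divided by I.Q. (the multiplier of 100 is a convention assumed here and does not affect the correlation), and then correlates I.Q. with A.Q.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate_quotients(n=5000, rho=0.6, seed=0):
    """Return (r between I.Q. and E.Q., r between I.Q. and A.Q.).

    All scores are simulated; n, rho, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    iq, eq = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        iq.append(100 + 15 * z1)
        # E.Q. shares part of its variance with I.Q. (correlation about rho).
        eq.append(100 + 15 * (rho * z1 + sqrt(1 - rho ** 2) * z2))
    # A.Q. = E.Q. / I.Q., scaled by 100; I.Q. appears in the denominator.
    aq = [100 * e / i for e, i in zip(eq, iq)]
    return pearson(iq, eq), pearson(iq, aq)


r_iq_eq, r_iq_aq = simulate_quotients()
print(f"r(I.Q., E.Q.) = {r_iq_eq:.2f}")
print(f"r(I.Q., A.Q.) = {r_iq_aq:.2f}")
```

Under these assumptions the correlation between I.Q. and A.Q. comes out negative even though achievement and intelligence are positively related, simply because I.Q. is the divisor of the ratio with which it is being correlated.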

Two studies have appeared which represent attempts to evaluate 
various methods which may be used in testing spelling. Pintner, 
Rinsland, and Zubin (108) studied the efficiency of two recognition 
types of tests as compared with the conventional dictation type. Both 
types of recognition tests were constructed from words of the Mor- 
rison-McCall scale. The reliability coefficient of each recognition 
form was .84, and the validities of the two forms, as shown by the 
raw correlations with the Morrison-McCall scale, were .69 and .74, 
respectively.* These facts show that the recognition method yields 
results which are accurate enough for group measurement. 
Guiler (58) reports a study on practically the same problem. He 
worked with oral recall, written recall, and recognition. Words from 
the upper ranges of the Buckingham extension of the Ayres Spelling 
Scale were used on about 700 students ranging from the seventh 
grade to college freshmen. His figures indicate, on the basis of 
averages, that written recall is the most effective in detecting lack of 
word mastery, though there is considerable variability among indi- 
viduals, and facts of reliability of differences are not given. 

Wilson (154) presents a critical examination of the new Stanford 
Spelling Test with respect to its usefulness in serving the main 
curricular aim of the subject, and in reinforcing good methods of 
teaching. He concludes, from comparison with the Thorndike Word 
Book and the Commonwealth List, that the test contains many words 
of unreasonable difficulty and therefore does not satisfy the funda- 
mental criteria of a good test. Wilson and Parsons (155) make a 
similar analysis of the spelling test used in the so-called Nation-Wide 
Test of the Public School Publishing Company, and reach quite the 
same conclusion with respect to it that Wilson reached with respect 
to the Stanford test. 

*It is obviously impossible to interpret coefficients of correlation
accurately without knowledge of the range in age or grade upon which the
correlations were based. However, in the limited space available for this
review it is not possible to include more facts than one or two raw
correlations on reliability and validity.

Henmon (62) presents a wealth of facts about the American 
Council Alpha Tests in modern languages. This series contains well 
standardized tests in French, German, Spanish, and an experimental 
form of a test in Italian. New facts are also given on the Twigg 
French Vocabulary Test in a section written by Beatley. The data 
given indicate that the Twigg test correlates more highly with
teachers’ marks than does the French test of the American Council series. 
Stoddard (134) has published a description of the origin and use of 
the Iowa Placement Examinations. Gillespie and Brotemarkle (49) 
present revised percentile norms, based upon the testing of several 
hundred college students, for several intelligence tests and a few 
educational tests, including the Courtis Arithmetic Test and the 
Trabue Language Scale J. Thornton (140) gives an analysis of the 
content of ten standard history tests. He concludes that the Hahn 
and Pressey-Richards tests are superior from the standpoint of
distribution of items over historical periods, and the Van Wagenen is 
superior from the standpoint of relative emphasis upon different 
fields, such as political, social, and economic. Engelhart (44) gives 
a summary of useful information on ten standardized tests in the 
field of education, with a critical evaluation of each. Speer (129) 
describes an experimental evaluation of seven composition scales: 
the Hudelson English, the Hudelson Typical Composition, the 
Leonard, the Lewis, the Nassau County Supplement of the Hillegas, 
the Van Wagenen, and the Willing scales. Rated in terms of the 
variability of scores assigned to the same specimens by different 
judges using the scale, the Nassau Supplement to the Hillegas Scale 
is the most satisfactory, and the Leonard Scale comes second. 
Lyman (83) discusses the development of composition scales, and 
suggests that the trend is towards more analytic types of scales which 
do not measure the combined product of so many complex factors. 
Douglas and Lawson (37) present some comparative facts concern- 
ing three important reading tests: the Sangren-Woody, the Monroe, 
and the Haggerty. The Sangren-Woody was the most successful 
in distinguishing among accelerated, average, and retarded groups in 
junior high school. These groups had been formed originally on the 
basis of a combination of teachers’ ratings and objective measures of 
mental ability. Intercorrelations were also computed among these 
three tests, and they were found to range from .56 to .75. Stab- 
ler (130) studied the validity of the Hill tests of civic attitudes and 
civic information in the case of 120 junior high school pupils. He 
found a correlation of —.41 between scores on the tests and number 
of civic deficiencies of the pupils as rated by the teachers. The cor- 
relation between scores on the Hill tests and intelligence quotients 
was found to be .27. Markt and Gilliland (87) report some statistical 
facts obtained in the use of the George Washington Teaching Apti- 
tude Test with 288 education students. Test results were found to 
correlate with ratings of the girls in practice teaching to the extent of 
only .12. A reliability coefficient of .64 was obtained. From the 
correlations it appears that the test measures intelligence as much as 
anything else. It correlates to the extent of .60 with the Otis 
Advanced Intelligence Examination, while the correlation between 
the two duplicate forms is only .64. 
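
The reasoning behind the remark on the George Washington Teaching Aptitude Test can be made explicit with a little arithmetic. The sketch below assumes classical test theory, under which the correlation of an observed score with any other measure cannot exceed the square root of the score's own reliability. Only the three coefficients (.64, .60, .12) come from the study as reported above; the function name and the framing as a "ceiling" are illustrative assumptions, not part of Markt and Gilliland's analysis.

```python
from math import sqrt


def correlation_ceiling(reliability):
    """Upper bound on a test's correlation with any other measure,
    assuming classical test theory (an assumption of this sketch)."""
    return sqrt(reliability)


# Coefficients as reported in the review for the aptitude test.
reliability = 0.64        # correlation between the two duplicate forms
r_with_otis = 0.60        # correlation with the Otis intelligence examination
r_with_ratings = 0.12     # correlation with practice-teaching ratings

ceiling = correlation_ceiling(reliability)
share_otis = r_with_otis / ceiling
share_ratings = r_with_ratings / ceiling

print(f"ceiling for any correlation: {ceiling:.2f}")
print(f"Otis correlation as a fraction of the ceiling: {share_otis:.2f}")
print(f"ratings correlation as a fraction of the ceiling: {share_ratings:.2f}")
```

Read this way, the correlation with the intelligence examination uses up most of what the test's reliability permits, while the correlation with teaching ratings uses very little of it.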

The Seashore musical test formed the subject for two investiga- 
tions. Highsmith (65) studied the results of the Seashore Test of 
Musical Ability applied to 59 students in a university school of music, 
and found on the basis of correlations that it measures something 
rather reliably, but that this something is not the same thing which 
is measured in this school by marks in either applied or
theoretical music. The reliability of the test was found to be .89. The 
correlation between test results and marks in applied music was only 
.31, and between test results and marks in theoretical music, .49. 
Of course this latter correlation is not very much lower than that 
between intelligence test results and school marks in the three R’s, 
and therefore the lack of agreement cannot be condemned as being 
unusual, but the fact remains that in this particular school the Sea- 
shore test cannot be used with much assurance for individual prog- 
nosis and guidance. Broom (13) investigated the degree of agree- 
ment among various measures of students’ abilities in music as 
yielded by the different parts of the Seashore Test of Musical Apti- 
tude. He worked with 102 ninth grade pupils and 82 college juniors 
and seniors. The agreement among the different tests was found to 
be significantly greater for the high school group than for the college 
group, the intercorrelations for the former group ranging from .83 
to .96, those for the latter group ranging from .08 to .41. 


IV. DEVELOPMENT AND USE OF TESTS FOR IMPROVING MARKS AND 
MARKING SYSTEMS 


The main practical contributions on the topic of improving marks 
have been in the nature of studies on the best methods of obtaining 
accurate records of students’ abilities by means of examinations. 
The discussion and study of the so-called new-type examination 
almost overshadows everything else in this connection. Ruch, in a 
recent book (116), gives a comprehensive summary of the theoretical 
considerations and experimental work bearing upon new-type exam- 
inations. It gives the most complete account that has so far 
appeared in one volume concerning the advantages, limitations, con- 
struction, and use of each type of examination. One chapter is 
devoted to relative values of standardized and non-standardized tests, 
and another to examination marks and marking systems. A small 
book by Cocks (29) has appeared on the value of true-false exam- 
inations. Practical suggestions are given concerning the use of this 
type of test in physics. Talbott and Ruch (136) discuss the essay 
and short-answer types of tests as methods of sampling student 
knowledge, and present an experimental study of the relative effi- 
ciency of these two types. The experiment consisted in giving essay- 
type examinations and very thorough short-answer examinations on 
the same chapters of various texts being studied by the classes. They 
conclude from their results that as a means of sampling factual 
knowledge the essay type is only about one-fourth as efficient as the 
short-answer type. Sims (125) studied the reliability of four types 
of vocabulary test—identification, multiple-response, matching, and 
checking. He gave by all four methods a list of 70 words selected 
from the Thorndike Test of Word Knowledge to 110 school chil- 
dren in grades 5 to 8. The reliability coefficients, as determined by 
the split-half method, are: identification, .92; multiple-response, 
.84; matching, .93; checking, .92. Intercorrelations of the various 
types range from .54 to .85, the checking test showing up poorest. 
On the basis of his data, the author feels that, all things considered, 
the matching test gives the most satisfactory group measure of 
vocabulary. Jones (74) makes a comparison between a
multiple-choice test and a true-false test. He obtains a correlation of .82 
between the two types. Edmiston (41) gets some results on a very 
limited sampling which would lead one to believe that such inter- 
correlations among the various new-type tests are higher for bright 
children than for dull ones. Greene and Lane (53) have devised 
some interesting new-type tests for measurement and instruction in 
plane geometry. Wrinkle (161), in a discussion of school marks, 
suggests that objective test results should serve as one measure of 
objective achievement, and that reports on other factors in which 
parents are interested should be given separately and not incor- 
porated in the achievement mark. 
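
The split-half method mentioned above in connection with the vocabulary tests can be sketched as follows. This is a minimal illustration, not the computation used in the study: the item scores are simulated, the odd-even division of items is one common way of forming halves, and the Spearman-Brown step-up to full test length is an assumption, since the review does not say whether it was applied. The group size of 110 and the list of 70 words merely echo the figures reported above.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per pupil.

    Returns (correlation between half scores, Spearman-Brown estimate
    for the full-length test).
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_half, even_half)
    r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown, doubled length
    return r_half, r_full


def simulated_scores(n_pupils=110, n_items=70, seed=1):
    """Hypothetical data: each pupil has an ability, each item adds noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pupils):
        ability = rng.gauss(0, 1)
        rows.append([1 if ability + rng.gauss(0, 1) > 0 else 0
                     for _ in range(n_items)])
    return rows


r_half, r_full = split_half_reliability(simulated_scores())
print(f"half-test correlation: {r_half:.2f}")
print(f"estimated full-test reliability: {r_full:.2f}")
```

Because each half contains only 35 of the 70 words, the raw correlation between halves understates the reliability of the whole test, which is why a step-up of this kind is commonly applied.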

An interesting extension of this problem of the use of tests in 
marks and marking systems has been indicated by Taylor (137), 
who studied the marks or ratings assigned to teachers. By means 
of partial and multiple correlations he studied the correspondence 
between ratings of teacher-ability by supervisors and achievement 
scores obtained by the pupils of these teachers. Each teacher was 
rated five times. Progress of the pupils in arithmetic and reading 
comprehension was measured over a period of time. It was found 
that the agreement between teacher ratings and pupil progress was 
not large in either subject, but was a little higher in reading than in 
arithmetic. 


V. USE OF TESTS IN PUPIL CLASSIFICATION 


One of the most important contributions that has come from the 
testing movement has been the demonstration of individual differ- 
ences. One of the earliest uses for which test results were recom- 
mended was in connection with the classification of pupils into groups 
which would be as much alike as possible in ability to progress. It 
is interesting to note that the problem of the desirability and suc- 
cess of the use of tests for classification of pupils still affords a lively 
topic for discussion and experimentation. The experimental results 
reported this year are no less conflicting than the opinions expressed. 
None of the experiments conducted this year is conclusive, because 
each is vulnerable for one or more of the following reasons: (1) 
The outcomes which are measured are not such as would be agreed 
upon by competent scholars in the field of philosophy of education as 
a reasonable sampling of all important outcomes to be considered; 
(2) variables indicated in (1) which are not in the experimental 
factor are not satisfactorily controlled; and (3) various parts of the 
experimental factor are not satisfactorily measured. 

Dvorak and Rae (39) report a study of achievement of pupils in 
segregated and non-segregated groups in the first grade. The segre- 
gated group made greater progress in reading, but less in spelling, 
than the other group. An examination of the teaching methods 
involved leads the experimenters to suggest that the usefulness of 
homogeneous grouping depends upon teaching methods adapted to 
the ability of the pupils. Purdom (113) conducted an experiment 
covering a period of one semester for the purpose of determining the 
effect of homogeneous grouping upon progress in English and 
algebra. Two groups, one relatively homogeneous and the other 
relatively heterogeneous, were used. These were equated in age and 
intelligence and taught by the same teachers. The author interprets 
his results as being unfavorable to homogeneous grouping, but his 
differences were small and not consistently in favor of either group. 
McGaughy (85) reviews the advantages and disadvantages of homo- 
geneous grouping and in general feels that the disadvantages out- 
weigh the advantages from the point of view of the well-rounded 
development of the greater number of pupils. Rock (115) surveys 
at length current practices in sectioning on the basis of ability. He 
concludes that actual experimental studies do not at this time show 
any advantage for ability grouping, on the whole, but he says it must 
be borne in mind that grouping up to this time has not always been 
accompanied by the necessary modifications in teaching procedure. 
Tharp (138) describes an experiment on ability grouping in college 
classes for foreign language instruction. The sectioning was done 
by means of the Otis Group Test and the Iowa Placement Examination.
The results achieved in the group which was so sectioned, as 
compared with the control group, indicate small differences in favor 
of the grouping method. 

Kefauver (77) discusses bases for forming ability groups and 
concludes that any measure for making such groupings should 
include (1) a composite of teachers’ ratings, (2) intelligence scores, 
and (3) general achievement test scores. Heilman (60) gives some 
figures which bear on the relative importance which should be 
attached to various factors other than educational test results in the 
formation of homogeneous groups. He reports an experiment 
designed to investigate the part played by mental age, school attend- 
ance, and socio-economic status in determining individual differences 
in educational age. From study of the records of 828 ten-year-old 
school children the author concludes that intelligence as measured is 
eight times as important as either school attendance or socio- 
economic status in accounting for variability in educational age. 
Wright (160) reports the success with which standardized achieve- 
ment tests are used in Indiana for promotion from elementary to 
secondary schools. 


VI. THE USE OF TESTS IN DIAGNOSTIC AND REMEDIAL TEACHING 

(a) Projects Involving the Use of Diagnostic and Remedial 
Methods. A number of remedial projects have been reported which 
involved the preliminary use of tests as a diagnostic measure. 
Guiler (56) describes an effort to improve the handwriting of sev- 
enth and eighth grade pupils. The Gettysburg Edition of the Ayres 
Handwriting Scale was used to rate the ability of the pupils at the 
beginning of the experiment. Difficulties were analyzed by means of 
Gray’s standard score card for measuring handwriting, Pressey’s 
chart for diagnosis of illegibilities in handwriting, and West’s chart 
for diagnosis of elements of handwriting. Specific training was 
given over a period of twelve weeks, during which time it appears 
that the average quality of the handwriting in the group improved 
from the third to the sixth grade level. Guiler (55) also describes 
a remedial project in arithmetic with seventh grade pupils. He con- 
cludes that about one full grade’s work in certain measurable skills 
was accomplished in twelve weeks of directed practice. Chase (27) 
reports an intensive study of the arithmetic difficulties of twelve 
pupils of a junior high school. A number of tests were used for 
diagnosis of difficulties, and these were followed by an analytic oral 
examination of each pupil. Ninety days of individual remedial treat- 
ment was given, and at the end of this period much improvement was 
found, amounting in some cases to as much as three years. Camp
and Allen (22) used a modified form of Gray’s Standardized Oral 
Reading Check Tests in an endeavor to diagnose the difficulties in 
oral reading of 170 children of grades 2 to 7. Following the diag- 
nosis, remedial work in syllabification and phonetics was given with 
apparent success. Carroll and Jacobs (25) describe a project in 
remedial instruction in reading with a group composed of college 
freshmen. The material used was selections taken from textbooks in 
different subjects, and arranged in a manner similar to that of the 
Thorndike-McCall Reading Scale. They report a gain for the drill 
group of 72 per cent in speed during the experimental period of four 
weeks. Guiler (57) presents achievement ratings in different sub- 
jects for a group of college freshmen and illustrates the general 
method of remedial instruction which was used in spelling and punc- 
tuation. His figures show that large gains were made in these two 
subjects. Jones (75) describes a project designed to keep the 
achievement of college students up to their ability levels as diagnosed 
by aptitude tests. The main measurements used were scores from 
achievement tests, scores from aptitude tests, and quarterly grades in 
courses. The efficiency with which a given student was working was 
measured by a comparison between aptitude scores, as shown by 
parts of the Iowa Examination, and school achievement. In the 
judgment of the reviewers, the reliabilities of the two measures 
which formed the basis of the comparison were not taken into 
account sufficiently. 
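
The reviewers' caution about reliability can be illustrated with the classical formula for the reliability of a difference between two scores. The sketch below is an illustration only, assuming both measures are expressed in standard-score units with equal variances; the function name and the numerical values are hypothetical and are not figures from Jones (75).

```python
def difference_score_reliability(r_xx, r_yy, r_xy):
    """Reliability of the difference X - Y between two standardized measures.

    r_xx, r_yy: reliabilities of the two measures (hypothetical inputs).
    r_xy: correlation between the two measures.
    Assumes equal variances (classical test theory formula).
    """
    if not -1.0 < r_xy < 1.0:
        raise ValueError("r_xy must lie strictly between -1 and 1")
    return ((r_xx + r_yy) / 2.0 - r_xy) / (1.0 - r_xy)


# Hypothetical values: two fairly reliable tests that correlate .60.
aptitude_reliability = 0.90
achievement_reliability = 0.80
aptitude_achievement_r = 0.60

rel = difference_score_reliability(
    aptitude_reliability, achievement_reliability, aptitude_achievement_r
)
print(round(rel, 3))
```

Under these assumed values the difference between the two scores is noticeably less reliable than either score taken alone, which is why a comparison of aptitude with achievement calls for attention to both reliabilities.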

Jacobson and Van Dusen (72) report the results obtained from 
the use of the McCall-Crabbs Standard Test Lessons in Reading in 
remedial work. These remedial exercises were given once a week to 
122 high school freshmen who had been found to be below their 
grade norm. The average grade score of this group as determined 
on the Iowa High School Silent Reading Test was 7.1 at the begin- 
ning of the experiment. At the end of about six months the same 
test was given again to the 102 pupils of the group who were still 
in school, and the average grade score was found to be 9.2. 
Rhoads (114) also used this well-known series of reading exercises 
in a study in which he attempted to determine whether wide reading 
for appreciation or practice in intensive reading is the better method 
of improving ability to understand words, sentences, and paragraphs. 
Two seventh grade classes of equal numbers and about equal capaci-
ties were used, one for the experimental group and the other for the
control. The improvement, as measured by the reading tests of the 
Stanford Achievement Battery, was larger for the group which had
practiced intensive reading. Maloney and Ruch (86) report an 
experiment to test the pedagogical value of tests in grammar courses. 
Four hundred and ninety-seven pupils in the ninth, tenth, and 
eleventh grades were divided into three groups in each grade, and 
grammar was taught in the three different divisions by different 
methods for a period of ten weeks. In one section the textbook 
method was used, in another section objective instructional tests were 
used, and in the third both methods were used. Achievement scores 
at the end of the period showed the test method to be the best and 
the textbook method to be the poorest. Weinberg (149) has written 
a brief discussion of the test method in teaching. Washburne (147) 
presents some interesting material on the value of self-measurement 
as an aid to learning and recall. He finds that social-science reading 
material with questions attached is mastered better, by a significant 
margin, than material with no questions. Moreover, he finds that 
the best position for questions is at the beginning of a section and 
that the poorest is at the end. 

(b) New Diagnostic Tests. The Pressey Diagnostic Reading 
Test (112) has been designed to measure reading vocabulary in 
grades 1 to 3. It is available in two forms. Tentative norms are 
presented. A reliability figure of .92 was obtained on third grade 
pupils. The same authors have devised diagnostic tests for analysis 
of reading difficulties in grades 3 to 9. Facts on reliability and 
tentative grade norms are given. Ingraham and Clark (71) have 
prepared a diagnostic reading test for primary and intermediate 
grades. Age and grade norms are given, and some facts on relia- 
bility. Bacon’s Diagnostic Tests in Latin (4) have been published. 
These tests are based upon Gray and Jenkins’ “Latin for Today: 
First-Year Course,” and are intended as an aid to teaching. No 
facts on norms or reliability are given. Powers has devised a Diag- 
nostic Latin Test (109) to cover the first year of Latin. Tentative 
norms are given for students having one and one-half semesters of 
Latin. Reliabilities of the various parts range above .80. One of 
the most comprehensive tests of grammar and language abilities that 
has yet appeared is the new test by Greene and Ballenger entitled 
the Iowa Elementary Language Test (52). Age and grade norms 
are given for grades 4 to 9. Reliabilities have been determined for 
various parts of the test. A Diagnostic Test in Whole Numbers and 
a Diagnostic Test in Decimals have been published by Brueckner (16). 
The same author, in collaboration with others, has issued a set of 
Diagnostic Tests and Practice Exercises in Arithmetic (14) which
they recommend for use with any basic arithmetic text in grades 
3 to 8. 

(c) Analysis of Abilities. There have been several studies 
which indicate attempts to analyze broad and general abilities into 
more unitary ones. Such studies and analyses are included under 
this section because they are sometimes the result of diagnostic test- 
ing and sometimes they pave the way for diagnostic testing. Miles 
and Segel (89) studied the relation of eye movements to reading 
ability in third grade pupils, and found that a composite eye-move- 
ment score correlated .66 with the Gates Silent Reading Test, .75 
with the Gates Primary Reading Test, and .70 with teachers’ ratings. 
Symonds and Lee (135) have taken 616 compositions which had 
been carefully rated on the Hillegas Composition Scale, and analyzed 
out knowledge of punctuation, capitalization and vocabulary shown 
at each grade level. Wilson (152) has prepared inventory and 
diagnostic tests in arithmetic. 

(d) Tests to Measure Units of Instruction. The Glenn-Obourn 
Instructional Tests in Physics (50) have been completed after sev- 
eral years of experimental work. The series consists of twenty-five 
different units which may be used in any order. It is more important 
as an instructional device than as a measuring instrument. How- 
ever, tentative norms in percentile units are given. A similar test 
for use in chemistry has been devised by Glenn and Welton (51). 
A unit achievement test in plane geometry has been prepared by 
Lane and Greene (79). Norms and reliability were computed from 
the results of testing over 6,000 children. Blaisdell (8) has devised 
an instructional test in biology consisting of twenty-five parts and 
designed to cover one year’s work. Percentile tables are given for 
aid in interpreting results. Branom (10,11) has prepared two sets 
of tests in geography for instructional purposes—one for elementary 
work and one advanced. These are similar to his earlier series of 
so-called practice tests. Brueckner (15) has prepared what he calls 
curriculum tests in arithmetic, covering grades 3 to 8. They are 
standardized for use at the end of each month. 

Instructional tests of the unit type designed to accompany par- 
ticular text-books are becoming available in greater numbers. The 
tendency may be illustrated by the diagnostic and measurement tests 
constructed by Sterling and Cole (133) to accompany “English for
Daily Use,” and A Standardized French Test, designed by E. B.
De Sauzé (36) to cover lessons in “Cours Pratique.” Sangren and
Marburger (122) have devised a set of instructional tests in physics
consisting of a preliminary diagnostic test, twenty-two instructional 
tests, and a final examination. The content is based upon five com- 
monly used physics texts. Tentative norms are given. 


VII. THE DEVELOPMENT AND USE OF TESTS FOR PROGNOSIS AND
GUIDANCE

Brigham (12) submits the fourth annual report of the Commis- 
sion on Scholastic Aptitude Tests for College Work. The test 
used and the findings are quite similar to those of the previous year. 
Facts from typical college groups indicate that scores on the test 
correlate with standing at the end of the first year in college to the 
extent of .52. All tests designed to measure mathematical aptitude 
have been omitted from the examination since 1927. It is interesting 
to note that the Commission now feels that a supplementary section 
of the test designed to measure this ability, or group of abilities, 
would make the test more useful for the selection of scientific 
students. Condit (31) reports a study of the classification of 559 
freshmen entering a teachers’ college by means of the Thurstone 
Examination and the Whitney-Heilman-Woody Entrance Examina- 
tion. Average scholastic ratings were found to correlate .45 with the 
Thurstone Examination and .50 with the Whitney-Heilman-Woody
Examination. Gerberich and Stoddard (48) report that ratings 
obtained by high school seniors on the Iowa survey test correlate 
from .50 to .59 with first semester grades received in college. Long-
staff and Porter (81) give a complete list of partial, multiple, and
zero order correlations for the following five variables: average 
of fourteen tests in psychology, final examination in psychology, 
score on Otis Self-Administering Test, average semester grade in 
college excluding psychology, and average high school grade. 
207 university students, mostly freshmen, were used as subjects. 
The correlations of the zero order range from .14 to .58. 
Stanton (131) has written a monograph describing the use of tests 
of musical prognosis at the Eastman School of Music. Applicants 
are now classified into five groups on the basis of combined scores 
on the Seashore Measures of Musical Talent and the Iowa Compre-
hension Test. A follow-up study of 351 students of the classes of
1925, 1926 and 1927 has, in the minds of the school officials, justified 
selection on this basis. Sorenson (128) reports a study of the rela- 
tion between success in various high school subjects and college 
success. The one high school subject found to be most highly cor-
related with general college success is Latin. French and mathe-
matics were not very far behind. The interpretation of Sorenson’s 
findings is, of course, made very difficult by the uncontrolled factor 
of the selection of native abilities involved. 
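
For readers who wish to see how partial and multiple correlations of the kind listed by Longstaff and Porter are derived from zero order correlations, the following is a minimal sketch using the standard three-variable formulas. The function names and the three correlations are hypothetical illustrations and are not values taken from any study cited here.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom


def multiple_correlation(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion y with two predictors, 1 and 2."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical zero order correlations (illustrative only):
r_grade_test = 0.50  # college grades with an aptitude test
r_grade_hs = 0.55    # college grades with high school average
r_test_hs = 0.40     # aptitude test with high school average

# Grades and test score, with high school average held constant.
partial = partial_correlation(r_grade_test, r_grade_hs, r_test_hs)

# Grades predicted from test score and high school average together.
multiple = multiple_correlation(r_grade_test, r_grade_hs, r_test_hs)

print(round(partial, 3), round(multiple, 3))
```

With these assumed inputs the partial correlation falls below the zero order value, while the multiple correlation exceeds either predictor taken singly, which is the usual pattern when the predictors are positively related to each other.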

A medical aptitude test has been prepared by Moss, Hunter and 
Hubbard (94). It has been given in about one-third of the class A 
medical schools of the United States. Statistical data are not yet 
available. Morris (93) has contributed what appears to be a prom- 
ising device for predicting success in teaching in elementary and 
secondary schools. A considerable amount of experimentation has 
been done, the results of which are described in her thesis (92). 
The facts so far reported are very encouraging, and it is hoped that 
further follow-up studies are under way which will help us to evalu- 
ate more definitely the practical usefulness of this device. Hunt (68) 
describes the teaching aptitude test of the George Washington Uni- 
versity Series. The reliability obtained by the split-half method 
from the scores of 100 teachers’ college students was .91. Correla- 
tions with grades in teacher-training courses range from .50 to .55. 
Correlations with ratings assigned by supervisory officers to teachers 
on the job range from .30 to .50. Tentative norms are given. 
Boardman (9) describes an attempt to devise a set of tests which 
would predict success in teaching. He includes in his battery an 
intelligence test, a test of professional information, and a test of 
teaching procedure. His criterion of teaching efficiency is a com- 
bination of ratings from superiors, colleagues, and pupils. The 
multiple correlation between the criterion and the weighted sum 
of the tests is .51. 
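
The split-half method mentioned in connection with Hunt's test can be sketched as follows: the test is divided into odd and even items, the two half scores are correlated, and the result is stepped up by the Spearman-Brown formula to estimate the reliability of the full test. The item scores below are invented for illustration, and the odd-even division is an assumption, since the source does not say how the halves were formed.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def spearman_brown(r_half):
    """Estimate full-test reliability from the correlation of two halves."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores (1 = right, 0 = wrong) per examinee."""
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))


# Invented data: six examinees, four items each.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

reliability = split_half_reliability(scores)
print(round(reliability, 3))
```

By the same formula, a correlation of roughly .835 between two halves would correspond to a full-test reliability of about .91.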

Henmon (64) summarizes contributions made to aptitude testing 
in the Modern Language Enquiry. He reports that the best aptitude 
test now available in modern languages will not correlate over .65 
with grades received in language work. He feels that the best 
basis at present for prediction of success in a language is a com- 
bination of aptitude test scores and scores on achievement tests in 
that language after a trial period. In this report Brigham presents 
the facts for the Princeton Artificial Language Test, Hopkins gives 
the figures for the Wilkins Prognosis Test, and Symonds describes 
the construction of the Symonds Foreign Language Prognosis Bat- 
tery. Two important prognosis tests have been prepared by 
Orleans (101, 100), one for geometry and the other for algebra. In 
the case of each, validity was measured by correlating scores on 
the test with later achievement in the subject. For the former
test the correlation was about .80; for the latter the correlations
ranged from .71 to .82. Schneck (123) studied the degree to which
numerical and verbal abilities are independent of each other. Five 
tests of verbal ability and four tests of numerical ability were given 
to 210 college students. The two abilities were found to be largely 
independent, the correlation between them being only .26. Bucking- 
ham (19) reports the results of an interview-test designed to meas- 
ure the knowledge of numbers of children entering the first grade. 
The test is concerned with counting, number concepts, and number 
combinations. From examination of 1,000 children the author con- 
cludes that there can be little doubt that first grade pupils are ready 
for the teaching of number. 

Zyve (162) has published an aptitude test for ability to do 
scientific research. A correlation of .74 between test scores and 
ratings by judges on a group of fifty graduate science students is
reported. Correlations given by him in 1927, from a preliminary
study of this test with three different small samplings, were .77, .89,
and .95. The test correlates only .51 with the Thorndike Intelligence 
Examination for a range extending from college freshmen to gradu- 
ates. The author calls special attention to the difference between 
the correlation of .74 and the one of .51, concluding from it that 
the test measures some capacities involved in scientific aptitude other 
than general intelligence as measured by the Thorndike test. Salis- 
bury and Smith (119) describe the construction of a prognosis test 
of sight-singing ability. The multiple correlation between an 
empirical criterion of sight-singing ability and the weighted sum 
of pitch, dictation and tonal memory was found to be .84. Wallin 
and Coles (146) present a phonetic spelling scale of 459 words 
designed to test innate spelling ability. Norms are given in both
percentage and standard deviation units. Limp (80) reports sta- 
tistical facts obtained from a try-out of prognosis tests of shorthand 
and typewriting which he is constructing. Stedman (132) also 
presents a study on the prognosis of typewriting ability, but none of 
the tests which he tried showed encouraging results. Three new tests 
designed to predict success in shop work have been reported: the 
Detroit Mechanical Aptitude Examination by Baker and Crock- 
ett (6), a shop-work test by Vickers and Hoskins (145) and a test 
of mechanical ingenuity by Brümmer (17).


VIII. BIBLIOGRAPHIES


One general bibliography, covering research studies in education 
up to June 30, 1927, has been published by the United States Bureau 
of Education (142). An anonymous bibliography (2) of 38 titles 
on educational and mental tests for the blind has appeared. 
Odell (97) has published an annotated bibliography of 300 titles on 
examinations and school marks. 

Yale University


About thirty of the reports appearing in 1929 were based on new 
tests and techniques. Very little was written about ratings, but 
the use of controlled observation as a scientific instrument was indi- 
cated in eleven studies. There is an increasing tendency to use 
existing techniques for the solution of problems. That is, the center 
of interest is gradually shifting from tools to problems. The total 
number of articles is far smaller than for several years. 

Summaries. In addition to the writers’ summary (71), Burn- 
ham (13), Fisher (27), Furfey (31), Sister McDonough (67), and 
Watson (115) provide lists of studies. 

Behaviors and Traits. Ratings and measures of groups of 
tendencies are reported by Decroly and Wauthier (22), Hen- 
ning (47,48), Jordan (58), McClure (66), South and Clark (95), 
Steen and Huntington (98), Steere (99), and Vernon (111). Dif- 
ferences between identical twins in sundry traits were studied by 
Newman (78,79). Inferiority feelings and emotional instability are 
dealt with by Ball (3), and Gardner and Pierce (32). 

The Colgate inventories, Conklin’s device, and the Freyd-Heid- 
breder list have been used in whole or in part in a variety of experi- 
ments by Broom (11), Brownell (12), Campbell (15), Estabrooks 
and Huntington (26), Garrett (33), Hovey (51), McClatchy (65), 
South and Clark (95), Weber and Maijgren (116), Wetmore and 
Estabrooks (117), and Whitman (120,121). New devices for 
measuring introversion-extroversion are reported by Newcomb (77), 
and Neymann and Kohlstedt (80). 

Oates (81) continues his studies of the Downey tests, and Har- 
rell and Davis (40) discuss their application to institutional children. 
They appear also in Newman’s studies of twins (78,79), and Peter- 
son and Lanier’s (84) study of negroes and whites, and in a report 
by Richardson (92). 

Pressey’s X-O tests, the Woodworth questionnaire and the Kent- 
Rosanoff association tests were used by Newman (78,79); Weber 
and Maijgren (116) included the X-O and association tests in their 
battery; Dickinson (23) used the Woodworth-Mathews, Wood-
worth-Cady and Kent-Rosanoff tests in a study of stutterers; the
X-O was used by Flowers (29), and norms for the Kent-Rosanoff
are reported by O’Connor (82). Town (108) reports a new test 
of emotional balance. Elonen and Woodrow (24) attempted to 
validate various Kent-Rosanoff measures; Flemming and Flem- 
ming (28) studied the validity of the Woodworth-Mathews device; 
and Gorham and Brotemarkle (36) investigated the validity of the 
X-O, Brotemarkle, and Downey tests. 

Confidence was measured by Jersild (55) and Greene (37) ; sug- 
gestibility by Estabrooks (25), Hull (52), McGeoch (69), and 
Young (127); social participation by Stoke and Cline (100) ; opti- 
mism and pessimism by Jasper (54); social facilitation by Ander- 
son (2); resourcefulness by O’Rourke (83); persistence by Poul 
and Montgomery (86); honesty by Bathurst et al. (6), Cros-
land (19), and Tuttle (109); social temperament by Bathurst (5);
perseveration by Cushing (20); negativism by Goodenough (35) ; 
coöperation, helpfulness, persistence and inhibition by Hartshorne,
May and Maller (42) ; foresight and control by Washburne (113) ; 
self-estimation by Schutte (94) and Jackson (53); accuracy and 
speed by Pollock (85), using the game called Guidit. 

Interests, Attitudes and Opinions. A technique of measuring
interest quantitatively is reported by Wyman (126). Whitley (119) 
challenges Lehman and Witty’s conclusions about the collecting 
interests of children. Vocational interests were studied by Rem- 
mers (88), Lehman and Witty (63), Strong (101, 102), and Strong 
and MacKenzie (103). Morris’ test (76) of the personality of 
teachers is now available. The effects of various incentives on 
accuracy of discrimination were measured by Hamilton (39). 

Attitude toward monotony was studied by Thompson (104); 
changes in attitudes effected by a summer camp are reported by 
Statten (97); sex differences in school attitudes are reported by 
Witty and Lehman (125) ; Hersey (50) studied periodic changes in 
feeling of happiness of male workers. Allport (1) reports a study 
of political attitudes; Wilkinson (122) investigated differences in 
social attitudes among several occupational groups. Tuttle (109) 
and Stabler (96) used the Hill Civic Attitude test; Flügel (30)
studied attitudes toward modern clothes. The Purdue rating scale 
for teachers was used by Remmers (87) ; Thurstone and Chave bring 
together in a monograph (106) their work on the measurement of 
attitude; the attitude of mothers toward sex education is reported
by Witmer (123). 

Jones (56, 57) has two articles on opinions of right and wrong
as held by teachers and children. Miller (72) studied superstitions
among college students. Wheeler and Jordan (118) noted the effects
of group opinion on individual opinion. Watson (114) reports the 
opinion of boys about a series of services of worship. Boynton (9) 
notes the relation of moral judgments to intelligence. 

Information and Ability. Religious ideas were measured by 
Tuttle (109) with a test of Watson’s. Moran (75) sought to dis- 
cover relations between moral judgment and Biblical knowledge. 
Carmichael (16) reports the knowledge shown by first grade chil- 
dren of conventional responses to imaginary situations common to 
six-year-olds. Guilford (38) and Landis (61) report studies in the 
ability to read facial expressions. Garrett (33) used the George 
Washington Social Intelligence test in his study of race differences. 
Valentine (110) made a study of intuitive judgment of character 
by men and women. 

Physiological Tests and Physical Type. Chappell (17) reports 
a study of the effects of deception on blood pressure. Hatha- 
way (43) investigated the relation between association time and the 
psychogalvanic reflex, using a new psychogalvanic apparatus. Landis 
and Slight (62) report cardiac responses in emotional reactions. 
Henry (49) investigated the relation of emotional state to basal 
metabolism. Rich (91) discusses body acidity in relation to emotion.
Mohr and Gundlach (73) report the relation of the Kretschmer type 
classification to types of crime. 

Observation. Hartshorne (41) describes a technique for guiding 
the observation of social functioning. Bower (7,8) and Chave (18)
report a variety of instruments for gathering facts about conduct 
and interest. 

The behavior patterns of smiling and laughing in infants were 
studied by Washburn (112). Gesell and Thompson (34) noted the 
differential effects of training on the emotional behavior of infant 
twins. Negativism in preschool children was studied by Rey- 
nolds (90). Careful record of how 14 four-year-old boys and girls 
spent their time was kept and reported by Bridges (10). Rugg, 
Krueger and Sondergaard (93) report a study of traits of kinder-
garten children as inferred from observed behavior. Civic defi-
ciencies of ninth grade children as observed on 56 occasions were
compared with civic information and attitude and school achieve-
ment by Stabler (96). Responses of boys to 30 situations and to
incidental occurrences were made the basis of a study of consistency
of extrovert-introvert tendencies by Newcomb (77).

Discussion. Basov (4) discusses the developmental course of 
the processes of behavior. Butterfield (14) raises questions about 
honesty tests. Dearborn (21) lists tests best adapted to new psy- 
chiatric uses. Henning (44, 45,46) comments on a variety of tech- 
niques for measuring character. Kneeland (60) contributes to the 
evaluation of rating procedures. Liber (64) makes some suggestions 
for a psychological study of ethics. McDougall (68) discusses the 
chemical foundations of temperament. McClatchy (65) criticises 
the concept of social intelligence. Matthews (70) discusses the 
influence of the position of a multiple choice response on its selection 
by the subject. Moon (74) reports a study of the reliability of 
ratings. Reynier (89) notes the need for tests of preference. Thur- 
stone (105) expounds the mathematical principles involved in scaling 
attitudes. Witty and Lehman (124) continue their controversy with 
Woodrow on the nature of character tests. 


NOTES AND NEWS


PROFESSOR GORDON W. ALLPORT of Dartmouth College has been
appointed Assistant Professor of Psychology at Harvard University 
and will take up his new duties in the fall of the present year. The 
appointment is designed to lead to the development of courses and 
research in social psychology at Harvard. 


DR. THEODORE F. KARWOSKI, recently National Research Fellow
in the Biological Sciences at Harvard University, has been appointed 
Assistant Professor of Psychology at Dartmouth College. 

The Administrative Board of the Wayne County Training 
School announces the appointment of Dr. T. G. Hegge as Director 
of Research. Dr. Hegge has also been appointed Special Lecturer 
in the Department of Psychology at the University of Michigan. 